Preface


The present book is based on lectures given by the author over 
a number of years to students of various colleges studying engineering 
and physics. The book includes some optional material which can 
be skipped for the first reading. The corresponding items in the 
table of contents are marked by an asterisk. 

In designing this course the author tried to select the most impor- 
tant mathematical facts and present them so that the reader could 
acquire the necessary mathematical conception and apply mathema- 
tics to other branches of science. Therefore in most cases the author 
did not give rigorous formal proofs of the theorems and intentionally 
simplified their statements referring the reader to characteristic 
particular cases and obvious examples. The rigorousness of a proof 
often fails to be fruitful and therefore it is usually ignored in practical 
applications. Some purely mathematical stipulations are made in 
the book only in the cases when they help the reader to avoid mis- 
conception in theory and application. Mathematical facts and 
objects which can be regarded as exceptional from the point of 
view of applied science are not even mentioned in the book. (For 
instance, when we speak about “all functions” we do not include 
the functions which are not Lebesgue measurable and even such 
functions as the everywhere discontinuous Dirichlet function and 
the like.) We tried to demonstrate the meaning of the basic mathe- 
matical concepts and to give a convincing explanation of the most 
important mathematical facts on the basis of intuitive notions. 
It is the author’s belief that in applied mathematics an explanation 
of this kind should be regarded as a proof. Such an approach is 
characteristic of applied mathematics whose main purpose is to 
provide an adequate qualitative description of a phenomenon and 
obtain the numerical solution of the corresponding problem in the 
most economical manner without exerting unnecessary effort. This 
approach essentially differs from that of pure mathematics whose 
corner-stone is the logical consistency of all the considerations based 
only on the concepts which have an exhaustive logical foundation.
The author is sure that it is the aspects of applied mathematics that
must determine the character of mathematical education of an
engineer and physicist. (But of course a teacher of mathematics
should have a good command of both pure and applied mathematics.)

These ideas of the author concerning mathematical education 
(represented in greater detail in his article on applied mathematics 
published in the journal Vestnik Vysshei Shkoly, 1967, No. 4, pp. 74- 
80) are still difficult to realize consistently. Therefore the author 
will be grateful to the readers for any advice and criticism. 

The book is composed in such a way that it is possible to use it 
both for studying in a college under the guidance of a teacher and 
for self-education. The subject matter of the book is divided into 
small sections so that the reader could study the material in suitable 
order and to any extent depending on the profession and the needs 
of the reader. It is also intended that the book can be used by 
students taking a correspondence course and by the readers who 
have some prerequisites in higher mathematics and want to perfect 
their knowledge by reading some chapters of the book. For this 
purpose we sometimes refer the reader to supplementary books (the 
bibliography is placed at the end of this course; the references are 
indicated by numbers in square brackets). We also supply the book 
with the name index, subject index and the list of symbols which 
enable the reader to find a desired definition, term or symbol. 

In some colleges analytic geometry and linear algebra are studied 
as independent courses. The structure of the book facilitates such 
a separation: the fundamentals of analytic geometry and linear 
algebra are given in Chapters II, VI, VII, X and XI. 

Some attention should be paid to the way the formulas and
sections are numbered in the book. The sections of each chapter
are numbered consecutively, beginning with 1. In references
inside each chapter we omit the number of the
chapter. For instance, the expression “formula (2)” placed in the 
text of Chapter VI means “formula (2) of Chapter VI”. But when 
formula (2) of Chapter VI is mentioned in some other chapter we 
write “formula (VI.2)”. Similarly, “§ II.3” means “§ 3 of Chapter II”
but we simply write “§ 3” when § 3 of Chapter II is referred to in 
this chapter; the expression “Sec. V.6” means “Sec. 6 of Chapter V” 
and so on. 

Studying the theoretical material should be followed by solving 
problems and doing exercises. For this purpose we can recommend 
the well-known collections of problems [2], [4], [26] and [47]. But it 
should be noted that some divisions of applied mathematics are 
not treated to a sufficient extent in these collections and therefore 
it is advisable that a teacher of mathematics should add some inte- 
resting and instructive problems concerning these divisions. 

The book can be of use to readers of various professions dealing
with applications of mathematics in their current work. Modern 
applied mathematics contains, of course, many important special 
divisions which are not included in this book. The author intends 
to write another book devoted to some supplementary topics such
as the theory of functions of a complex argument, variational calculus, 
mathematical physics, some special questions of the theory of ordi- 
nary differential equations and so on. 

When preparing the book for the second edition the author con- 
siderably revised the text and added some new material including 
the chapter on the theory of probability*. Besides, the author has 
taken into account valuable advice and criticism received from 
many mathematicians, in particular from the members of the Mos- 
cow mathematical society where the book was discussed. Some 
sections of the book were written or revised under the influence of 
ideas and useful comments of L. M. Altshuler, Ya. B. Zeldovich 
and B. O. Solonouts. To all of them the author expresses his warmest 
gratitude. 

A. D. Myskis 


April 19, 1966 


* This edition is the English translation of the second Russian edition 
of the book. The first Russian edition contained a chapter in which a brief 
review of basic equations of mathematical physics was given. The chapter was 
excluded from the second edition because of some changes in the syllabus of tech- 
nical colleges. We have included the material of this chapter in this English 
edition as the Appendix at the end of the book. The present translation incorpo- 
rates suggestions made by the author.—Tr. 

Introduction


1. The Subject of Mathematics. Numerical calculations are pene- 
trating into the fields of work of physicists, chemists and engineers 
of various specialities. The modern development of science and 
engineering makes it necessary to deduce and apply still more com- 
plicated laws, to solve very complicated problems and perform 
extensive calculations. 

All such calculations are based on mathematics, the science which 
treats of relations existing between spatial forms, quantities and 
magnitudes of the real world. All the basic notions of mathematics 
emerged and were developed in connection with the demands of 
natural sciences (physics, mechanics, astronomy etc.) and enginee- 
ring. The appearance of more complicated problems led to the crea- 
tion of more sophisticated mathematical methods of investigation 
(i.e. mathematical rules, techniques, formulas and the like) and, 
in particular, to the foundation of higher mathematics. It is there- 
fore not accidental that the fundamentals of higher mathematics 
were created in the 17th and 18th centuries, i.e. at the beginning 
of an intensive development of industry, although some elements 
of higher mathematics appeared as early as antiquity in the works
of the great Greek mathematician and mechanician Archimedes 
(287-212 B.C.). 

Higher mathematics was founded in the works of the prominent 
French philosopher, physicist, mathematician and physiologist 
R. Descartes (1596-1650), the great English physicist, mechanician, 
astronomer and mathematician I. Newton (1642-1727), the great 
German mathematician and philosopher G. Leibniz (1646-1716), 
the great mathematician, mechanician and physicist L. Euler 
(1707-1783) and many other famous scientists. In their works diffe- 
rent divisions of mathematics were created for investigating phe- 
nomena of nature and solving engineering problems. In mathematics, 
as in other sciences, practical work is the main source of scientific 
discoveries. Another important source is the need of mathematics
itself to systematize the facts discovered, to investigate their
interrelations and so on.

2. The Importance of Mathematics and Mathematical Education. 
Mathematics and, in particular, higher mathematics plays a very 
important role in modern natural science and engineering. Mathema- 
tics lies in the foundation of all divisions of physics, mechanics 
and many divisions of other natural sciences, engineering and some 
other branches of knowledge. Designing the construction of an 
airplane or a dam of a hydro-electric power station, investigating 
complicated processes involved in deformation of metals, propa- 
gation of radio-waves, diffusion of neutrons in an atomic reactor etc. 
cannot be performed without systematic application of mathe- 
matics. 

The high level of development of computational methods in the 
USSR is one of the main factors that led to the triumphant achieve- 
ments in launching the first artificial satellites of the Earth, space 
rockets and spacecraft. The creation of high-speed electronic compu- 
ters and other mathematical automatic devices leads to further 
extension of the application of higher mathematics and facilitates 
the introduction of computational methods into many new fields. 
In particular, this is the case in such fields as economics, manage- 
ment and control of industry, elaborating optimal (i.e. the best) 
plans of capital investments or construction, transportation problems, 
controlling technological processes, dispatching and so on. The 
application of mathematical methods in these fields has already 
proved to be very effective and profitable. In recent years mathema- 
tics has been penetrating into such traditionally “non-mathematical” 
fields as biology, physiology, geography etc. 

Therefore nowadays the requirements for mathematical educa- 
tion of an engineer are very high. An engineer must know the basic 
principles of higher mathematics and be able to apply them to
concrete problems. Then mathematics will become a powerful tool
in his hands. Besides, a great deal of scientific literature and many 
special technical subjects are saturated with mathematical techni- 
ques and formulas. Without sufficient knowledge of mathematics 
much effort is needed to understand all these formulas which may 
hinder the reader’s work and mislead him. Mathematics also faci- 
litates a better understanding of many questions related to other 
sciences (the theory of vibrations, mechanics of continuous media 
and so on). 

3. Abstractness. Mathematics itself is not a technical subject 
and therefore a course in mathematics for engineers and scientists 
must not treat any special technical questions. Its aim is to provide 
the necessary mathematical education. Therefore a student may 
sometimes feel that the questions treated in a course of higher mathe- 
matics are too abstract. But the abstractness of mathematics is one 
of its most essential features. This does not at all mean that
mathematics has little to do with practical activities. On the contrary,
it is the possibility to apply mathematics to various kinds of
activities that makes its abstractness so important. For instance, in
geometry we consider an “abstract” cylinder and find its volume.
This immediately enables us to compute the volume of any concrete
cylinder no matter whether it is a component of a mechanism or
a column or a portion of space occupied by an electric field.
Similarly, in higher mathematics we deduce some general abstract laws
whose statements are not directly connected with a particular form 
of practical activity, a natural science or engineering, but the con- 
crete applications and realizations of these laws (which are studied 
as examples in a mathematical course) are always related to various 
phenomena of the real world.

Thus, mathematics considers pure, ideal (schematized) forms, 
relations, processes etc. whose realization serves only as an appro- 
ximation to reality. For instance, a real cylinder can never be a 
perfect cylinder from the mathematical point of view. Here we see 
the manifestation of a distinguishing feature characteristic of any 
kind of human cognition: when considering a real object or process 
we always select a number of basic properties from an infinite variety 
of properties of the object or process and investigate these most 
essential properties abstracting them from inessential ones. But 
it may sometimes happen that an assumption, hypothesis, that all 
the properties except those chosen as basic ones are inessential is 
not true and then we can arrive at a contradiction between our 
mathematical inferences and reality. Such a possibility must never 
be forgotten! 

Because of the abstractness of forms and relations the logical 
consistency of inferences in mathematics is extremely important, 
more important than in other sciences, this being well known even 
from elementary mathematics. In higher mathematics too, all 
the assertions must be completely clear and logically justified so 
that it should be possible to regard them as objective laws adequate 
to reality. In mathematics, and particularly in higher mathematics, 
we also sometimes draw certain conclusions from experiment, obser- 
vation and analogy but nevertheless such a situation is rarer in 
mathematics than in other sciences. 

There is a characteristic tendency in mathematics to deduce all 
the assertions from a few basic principles (called axioms). This is 
the so-called deductive method. But in our introductory course 
which is intended for those who are mainly interested in applications 
we shall not rigorously follow this method in all cases. The reader 
interested in theory may find some inferences in our course to be 
imperfect from the point of view of logic. If he wants to get a better 
understanding of some exceptions to general rules and to learn
mathematics more thoroughly, he should study a course written
for mathematicians, for instance, [14]. To consider the same question 
from different points of view it is advisable to take other courses 
intended for technical colleges (for instance, [5], [87], [44] and [49]; 
we particularly recommend book [5]). 

4. Characteristic Features of Higher Mathematics. There is no 
distinct boundary between elementary and higher mathematics, 
the division being conditional. These are not at all different sciences,
and the division is mainly accounted for by some historical reasons
as elementary mathematics and higher mathematics were created
in different historical epochs. But nevertheless we can point out
some characteristic features of higher mathematics.

One of them is the universality, generality, of its methods. As 
an example, let us take the problem of finding volumes of solids. 
Elementary mathematics gives us different formulas for computing 
the volumes of a prism, pyramid, cone, cylinder, sphere and some 
other simple solids. Each formula is obtained on the basis of a spe- 
cial argument which is rather complicated in certain cases. But in 
higher mathematics we have general formulas expressing the volume 
of any solid, the length of any curve, the area of any surface and 
the like. Take another example. Consider the problem of investiga- 
ting the motion of a material point under the action of given forces. 
In elementary courses in physics (based on elementary mathematical 
methods) we study only uniform rectilinear motion, uniformly acce- 
lerated rectilinear motion, uniformly decelerated rectilinear motion 
and uniform circular motion, and it is rather difficult to investigate 
other types of motion by means of techniques of elementary mathe- 
matics. But the methods of higher mathematics make it possible 
to investigate any type of motion which can be encountered in 
practical problems. 

There is another characteristic feature of higher mathematics 
(related to the above one). It is the systematic consideration of 
variable quantities. When investigating various objects and pro- 
cesses by means of elementary mathematics we usually regard such 
important quantities as velocities, accelerations, densities, masses, 
forces etc. as being invariable, constant (and yet we attain the 
aim only in some simple cases). But if these quantities vary con- 
siderably (as is often the case) we cannot regard them as being con- 
stant. To solve such problems we usually apply higher mathematics. 
There is a branch of higher mathematics (called differential calculus) 
which is one of the earliest divisions of mathematics particularly 
intended for solving various problems connected with an investi- 
gation of the dependence of one quantity upon another. The quan- 
tities and their interrelations can be of any nature (for instance, 
we can consider the relation between the acceleration, velocity 
and path length of a motion or between the density, mass and force 
and the like). Therefore differential calculus deeply penetrates into
various natural sciences and engineering. 

The third characteristic feature of higher mathematics is the 
close relationship between its various divisions and the systematic 
unification of the computational, analytical (based on formulas) 
and geometric methods in contrast to elementary mathematics in 
which the connection between algebra and geometry is more or 
less accidental. In higher mathematics, the coordinate method re- 
duces geometric problems to solving algebraic equations, graphs 
are used for representing relations between variable quantities, 
analytical methods of integral calculus are applied for computing 
areas and volumes of geometric figures and so on. 

Some historical remarks will be given in due course in this book. 
But it is expedient to make some introductory notes here. The 
most important divisions of higher mathematics which now form 
the basis of the syllabus for engineers of many specialities were 
created in the 17th and 18th centuries. They include the coordinate 
method, differential and integral calculus etc. These divisions are 
represented in courses of higher mathematics for engineers mostly 
in the form they appeared after the works of Euler. L. Euler (a Swiss 
by birth) spent most of his life in Russia and died in Petersburg. 
Most of his works (473 out of 865) were published in Russia. His 
outstanding results in various divisions of mathematics, mechanics, 
physics and other sciences lie in the foundation of these divisions. 

Mathematics was created by scientists of many countries. Among 
Russian mathematicians we should mention N. I. Lobachevsky 
(1792-1856), the creator of a non-Euclidean geometry. He also obtai- 
ned some important results in other divisions of mathematics and 
initiated mathematical studies in Kazan. An intensive development 
of mathematics in Petersburg began with the works of the prominent 
mathematician Academician M. V. Ostrogradsky (1804-1862). The 
founder of the famous Petersburg mathematical school was the 
great Russian mathematician and mechanician Academician 
P. L. Chebyshev (1821-1894). He obtained many important results 
in various fields of mathematics and its applications to the theory 
of mechanisms, cartography etc. 

After Chebyshev the most prominent representatives of the Petersburg
mathematical school were Academician A. A. Markov (1856-1922),
a famous mathematician and the creator of the theory of
random processes, and Academician A. M. Lyapunov (1857-1918), 
the founder of the theory of stability. 

Since the second half of the 19th century mathematical investi- 
gations have been developing in Kiev, Moscow, Odessa, Kharkov, 
and other Russian towns. 

5. Mathematics in the Soviet Union. In the Soviet Union there 
are many centres of mathematical research. Among prominent 
Soviet mathematicians we should mention Academicians A. D. Aleksandrov,
P. S. Aleksandrov, N. N. Bogolyubov, V. M. Glushkov,
L. V. Kantorovich, M. V. Keldysh, A. N. Kolmogorov, M. A. Lav- 
rentyev, Yu. V. Linnik, N. I. Muskhelishvili, P. S. Novikov, I. G. 
Petrovsky, L. S. Pontryagin, V. I. Smirnov, S. L. Sobolev, A. N. 
Tikhonov, I. N. Vekua, I. M. Vinogradov, and others. 

Mathematics is being intensively developed both in old centres 
and in new ones in Baku, Erevan, Gorki, Lvov, Minsk, Novosibirsk, 
Rostov, Saratov, Sverdlovsk, Tashkent, Tbilisi, Vilnyus, Voronezh, 
and other towns. 

The role of mathematics in other sciences, industry and enginee- 
ring has considerably increased. Many mathematicians work out 
new theoretical problems of other branches of knowledge connected 
with applications of mathematics. At the same time many physi- 
cists, mechanicians and engineers take part in the development 
and applications of those divisions of mathematics which are related 
to their fields of work. As examples of fruitful unification of mathe- 
matics and its applications we can mention the works of the great 
Russian scientist and one of the founders of modern flight mechanics 
and hydromechanics N. E. Zhukovsky (1847-1921), the prominent 
Russian scientist, mathematician, mechanician and naval architect 
Academician A. N. Krylov (1863-1945), the prominent Soviet 
scientist in the fields of theoretical mechanics and aerodynamics,
Academician S. A. Chaplygin (1869-1942), and
others.

There is no doubt that development of mathematical education 
will further increase the role of mathematics in our life and yield 
fruitful results. 


CHAPTER I 


Variables and Functions 


§ 1. Quantities 


1. Concept of a Quantity. It is difficult to give a strict definition 
of a quantity since the notion is extremely general and universal. 
Masses, pressures, charges, different kinds of work, lengths and 
volumes are examples of quantities. It will be sufficient for our 
further aim to regard as a quantity everything that is expressible in 
certain units and completely characterized by its numerical value. 
For instance, masses are measured in grams or kilograms and the 
like. We can say that the area of a circle is a quantity since it is 
completely characterized by its numerical value (for example, 5, π
etc.) if we measure it in certain units, e.g. in square centimetres. 
The circle itself regarded as a geometric figure is of course not a quan- 
tity because it is characterized by a certain geometric form which 
cannot be expressed numerically. 

Many notions which were originally understood only in a quali- 
tative aspect have been recently “advanced” and transferred to the 
class of quantities (for instance, such notions as effectiveness, infor- 
mation and even likelihood). Every change of this kind is a great 
event since it enables us to apply quantitative mathematical me- 
thods to investigating the corresponding notions and this usually 
turns out to be very effective. 

2. Dimensions of Quantities. A unit measure which is used for 
expressing a quantity is called the dimension of the quantity. For in- 
stance, the gram or the kilogram usually serves as the dimension 
of mass. The dimension of area is the square centimetre or the square 
metre and so on. A dimension is denoted by square brackets. For 
instance, if M is a mass and S is an area then [M] = kg (the kilogram)
and [S] = m² (the square metre) in the international system
of units.

Usually the units of some quantities are regarded as fundamental 
units whereas the units of all other quantities are derived units 
expressed in terms of the fundamental ones. For instance, the units 
of length (m), of mass (kg) and of time (sec) are the fundamental 
units in the international system of units (SI), and the unit of
velocity (m/sec) or of force (kg·m/sec²) is expressed in terms of the
fundamental units.

We can add together and subtract only quantities of the same 
dimension, the dimension of a sum being that of the summands. 
It is permissible to multiply or divide quantities of arbitrary di- 
mensions. The multiplication or division of quantities yields, res- 
pectively, the multiplication or division of their dimensions. 
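
These rules can be illustrated by a small computational sketch. It is only an illustration under our own assumptions: the class name Quantity and the representation of a dimension by the exponents of the fundamental units (m, kg, sec) are hypothetical and are not taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Hypothetical helper: a numerical value together with its dimension.

    The dimension is stored as the exponents of the fundamental units
    (m, kg, sec); for example (1, 0, -1) stands for m/sec.
    """
    value: float
    dim: tuple

    def __add__(self, other):
        # Only quantities of the same dimension may be added.
        if self.dim != other.dim:
            raise ValueError("cannot add quantities of different dimensions")
        return Quantity(self.value + other.value, self.dim)

    def __sub__(self, other):
        # Only quantities of the same dimension may be subtracted.
        if self.dim != other.dim:
            raise ValueError("cannot subtract quantities of different dimensions")
        return Quantity(self.value - other.value, self.dim)

    def __mul__(self, other):
        # Multiplying quantities multiplies their dimensions,
        # that is, the exponents are added.
        dim = tuple(p + q for p, q in zip(self.dim, other.dim))
        return Quantity(self.value * other.value, dim)

    def __truediv__(self, other):
        # Dividing quantities divides their dimensions,
        # that is, the exponents are subtracted.
        dim = tuple(p - q for p, q in zip(self.dim, other.dim))
        return Quantity(self.value / other.value, dim)


length = Quantity(6.0, (1, 0, 0))   # 6 m
time = Quantity(2.0, (0, 0, 1))     # 2 sec
mass = Quantity(5.0, (0, 1, 0))     # 5 kg

velocity = length / time            # dimension m/sec
force = mass * velocity / time      # dimension kg·m/sec²
ratio = mass / Quantity(2.0, (0, 1, 0))  # same dimensions: dimensionless
```

An attempt to evaluate length + time in this sketch raises an error, in accordance with the rule that only quantities of the same dimension can be added.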

We also consider dimensionless (“abstract”) quantities. For instance, 
the ratio of two quantities of the same dimension is dimensionless. 
The numerical value of the ratio of a quantity to the chosen unit 
measure is also dimensionless. For example, the numerical value 
of the mass of 5 kg is the “dimensionless mass” 5. We can also obtain 
a dimensionless mass if we take the ratio of the mass to a certain 
mass which is characteristic of the process in question (such a mass 
is supposed to be well known, and we choose it as a standard to 
compare with). Dimensionless length, time etc. are introduced in 
like manner. 

In mathematics we usually regard quantities as dimensionless. 
Finally, a dimensionless quantity is completely characterized by 
its numerical value, and its “unit measure” is the number 1. 

3. Constants and Variables. A quantity entering into an investigation
can take on either different values or only one fixed value. In
the first case we call the quantity a variable quantity or, in short, 
a variable, and in the second case we call the quantity a constant 
(a constant quantity). Suppose we consider the water in a basin. 
The water pressure measured at different points of the basin is 
a variable since it varies and is different at different points. At the 
same time the water density can be regarded as a constant since 
it takes on one and the same value (with a sufficient degree of accu- 
racy) at different points. As another example let us consider the 
process of compressing a given mass of a gas while the temperature 
is kept constant. Then the pressure and the volume are variables 
whereas the mass and the temperature are constants. But it should 
be noted that in a real process the last two quantities inevitably 
vary a little. Hence, we can schematize the process and conditionally 
regard the mass and the temperature as constants only in case their 
real variations are of no importance for our investigation. And in 
many other cases the constancy of some quantities should be understood
in a conditional sense. We must never forget it since, if we
regard a quantity as a constant in a process in which the variations 
of the quantity, small though they may be, are essential for the 
investigation, we may arrive at wrong conclusions and our schema- 
tized model will not apply. 

It may happen that a quantity which is constant in a certain 
treatment of a phenomenon takes on a different value or even becomes 
a variable under some other (though similar) circumstances. The
constant quantities of this kind are called the parameters of the 
process; they are the characteristics of the process. For example, 
the mass and the temperature of a gas are the parameters of the 
process of isothermal compression. When we deal with an electric- 
light bulb we take into account such parameters as the resistance, 
the supply voltage the bulb is designed for and the power consump- 
tion. Even in this case there are some other parameters which may 
also be taken into account (for instance, the sizes of the bulb) but 
usually we do not regard these parameters as basic ones. Generally, 
in all cases it is very important to choose the basic, the most signi- 
ficant parameters among various parameters characterizing an object. 

4. Number Scale. Slide Rule. Quantities can be represented 
visually by means of a number scale. For this purpose we usually 
take a rectilinear axis with a uniform scale. To construct a number 
scale we choose a straight line and a point on the line which serves 
as the origin (the origin is usually designated by the letter O). We
choose one of the directions on the straight line as the positive 
direction and take a certain line segment as a unit of length (the 
positive direction is indicated by an arrow; see Fig. 1). Setting off 
the unit segment from the origin in both directions and repeating
the procedure infinitely we obtain all the points which correspond
to the integral values of the quantity. Between the “integer points”
there are points representing fractional values, both rational (such
as 1/2, -2.03 etc.) and irrational (that is, fractional numbers that
are not rational, e.g. √2, -π and the like). In case we have
a dimensional quantity the segment chosen as the length unit also
acquires the corresponding dimension. For example, the numerical 
values of time t depicted in Fig. 1 are expressed in seconds; we also
see the points N (t = -1 sec), O (t = 0 sec) and M (t = 1.37 sec)
there. 

To each value of the quantity there corresponds a certain point on 
the number scale and, conversely, each point on the number scale cor- 
responds to a certain value of the quantity. Besides, there is a one- 
to-one correspondence between the values of the quantity and the 
points, that is each point corresponds only to one value and vice
versa. (Here and further we consider real quantities, that is quan- 
tities which take on only real numerical values; complex quantities 
will be treated in Sec. VIII.1.) On the basis of these properties 
we often identify the values of a quantity with the corresponding 
points; we simply say “the point t = 1.37 sec” and the like.

If a quantity is variable it is represented by a point which can 
occupy different positions on the axis (on the number scale). For 
example, such a point can move along the axis as time passes. In
case the quantity is constant the corresponding point occupies
a fixed position and does not move. A point which represents a
variable is called a variable point (moving point, current point).

Fig. 2

In practical applications we try to choose the origin and the unit 
of length in such a way that the range of variations of the quantity 
should be represented in the most suitable manner. The origin
itself is sometimes not depicted because we can draw only a part
of the infinite axis. For example, Fig. 2 represents the number scale 
on which the values of the length of a rod subjected to thermal expan- 
sion are shown. 

It is sometimes convenient to use scales which are non-uniform. 
For instance, logarithmic scales are often of use (see Fig. 3). A number
n > 1 is represented on such a scale by a point which is obtained
by drawing the line segment of length k log n (where k is
a factor of proportionality suitably chosen) in the positive direction
from a point A. Positive numbers n < 1 are obtained on the logarithmic
scale by drawing the segment of length k |log n| in the negative
direction from A because for such n we have log n < 0.

A logarithmic scale is, in particular, utilized in the construction 
of a slide rule. The instrument consists of a ruler and a slide which 
are graduated with similar logarithmic scales. To understand the 
principle of a slide rule let us suppose that the scales are shifted 
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with respect to each other (see Fig. 4) so that two points a and b 
on the lower scale coincide with the corresponding points a’ and b’ 
on the upper scale. Then we have 

$$k \log b - k \log a = k \log b' - k \log a'$$

since the lengths of the shaded line segments are equal. Now after
some simple transformations we obtain (check it up!)

$$\frac{b}{a} = \frac{b'}{a'} \qquad (1)$$

Three of the values a, b, a’ and b’ being given, we can read the fourth
value on the slide rule. This fourth value will satisfy relation (1).


If, for example, we put a’ = 1 then b = ab’ or a = b/b’. Consequently,
to determine the product of two given numbers a and b’ we must
make the point 1 on the slide coincide with the point a on the ruler
and then read the value of the product which is indicated on the
ruler by the point b’ of the slide. (Think how to find the quotient
of two given numbers.) It is sometimes convenient to put b = 10
instead of a’ = 1 and to move the slide not to the right but to the
left with respect to the ruler. The slide rule was invented in the
17th century. It is widely used now and facilitates the work of
many technicians, engineers, physicists etc. Supplementary scales
on the slide rule make it possible to perform various additional
operations including extracting roots, taking logarithms, raising
to a power, solving equations of different types and so forth. There
are a number of handbooks on using the slide rule, for example,
[36] and [43] to which we refer the reader.

Fig. 4
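
The principle of the slide rule can be checked numerically. The following Python sketch is only an illustration and is not part of the original course: it assumes decimal logarithms, an arbitrary scale factor k = 25, and the hypothetical helper names `position`, `read_scale` and `slide_rule_multiply`. Setting the mark 1 of the slide against the mark a of the ruler adds the two distances k log a and k log b’, so the ruler mark opposite b’ is the product ab’.

```python
import math

K = 25.0  # factor of proportionality k; an arbitrary choice for this sketch


def position(n: float) -> float:
    """Distance k*log10(n) of the mark n from the mark 1 on a logarithmic scale."""
    return K * math.log10(n)


def read_scale(distance: float) -> float:
    """Number whose mark lies at the given distance from the mark 1."""
    return 10.0 ** (distance / K)


def slide_rule_multiply(a: float, b_prime: float) -> float:
    """Mark 1 of the slide is set against a on the ruler; the ruler is then
    read opposite the mark b' of the slide. Distances add, numbers multiply."""
    return read_scale(position(a) + position(b_prime))


product = slide_rule_multiply(2.0, 3.0)
print(round(product, 6))
```

The quotient is obtained in the same sketch by subtracting the distances instead of adding them.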

Some curvilinear scales are also of use in certain cases (for example, 
see Sec. IX.1). But in our course we shall usually use rectilinear 
axes and uniform scales for representing quantities unless the con- 
trary is explicitly stated. 

5. Characteristics of Variables. A variable which takes on all 
the numerical values or all the values lying between some limits 
is called continuous. On the contrary, a variable which assumes 
certain “separated” values is called discrete. 


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The set of all numerical values which may be assumed by a variable 
is called the range of the variable. 

Now we introduce the notion of an interval which is of use for 
characterizing ranges of some types of variables. 

A finite (bounded) interval is the set of all numbers contained
between two given numbers a and b. The numbers a and b are called
the end-points of the interval. The end-points a and b may or may
not be included in the interval and this fact should sometimes be
indicated. In the first case we call the interval closed
(i.e. a ≤ x ≤ b and the end-points are thus included) and
denote it by [a, b], and in the second case we say that the interval
is open (i.e. a < x < b and the end-points are excluded) and denote
it by (a, b). Finite intervals are represented by line segments on
the number scale.

There are also unbounded (infinite) intervals for which a or b
or both a and b may be infinite. For example, if a variable x assumes
all possible values greater than some constant number a the range
of the variable is described by the inequalities a < x < ∞. This
is an example of an infinite interval; it has no finite right end-point,
of course, but in such a case we say, conditionally, that the right
end-point is at infinity. An interval of this kind is also said to have
no upper bound since, when a variable may increase without limit,
we usually interpret the variable as “rising up”. The notion of a
lower bound is understood in a similar manner. The collection of
all real numbers is an interval with neither lower nor upper bound
(that is, geometrically, the whole number scale).

The range of a continuous variable is an interval or a collection
of some number of intervals. For example, if a triangle ABC is deformed
in all possible ways the corresponding angle A is a continuous
variable whose range is the interval 0 < ∠A < π (in case the numerical
values of the angle are expressed in radians). At the same time
the area S of the triangle has the interval 0 < S < ∞ as its range
(of course, here we also mean that the numerical values of the area
are measured in certain units but we are not going to mention details
of this kind in all cases in future). The range of a discrete variable
is a set (finite or infinite) of separate real numbers. We can also
say, in the geometric sense, that such a range consists of separate
points (but not of entire intervals). For example, let an index assume
the values 1, 2, ..., n. Then it is a discrete variable.

If a variable changes in a certain process in such a way that its 
numerical values vary only in one direction, that is they either 
increase or decrease, it is called monotonic. The point representing 
a monotonic variable on a number scale moves in one direction. 

It is inconvenient to consider constant quantities apart from
variables and therefore we can regard a constant quantity as a special
case of a variable, i.e. a variable which all the time assumes
only one fixed value (the same idea is used in mechanics when the
state of rest is regarded as a special case of motion). The range of
a constant consists of only one point.

We say that a variable changing in a certain process has an upper 
bound if all the time it remains smaller than a constant (such a 
constant is called an upper bound of the variable; it is clear that 
a variable having an upper bound has in fact an infinitude of upper 
bounds because every constant greater than a given upper bound 
of the variable can serve as a new upper bound). We likewise define 
the notion of a lower bound of a variable. Of course, a variable 
may have no upper bound or no lower bound (or neither of them). If a
variable has both an upper bound and a lower bound it is simply
called a bounded variable. Variables having upper bounds (lower
bounds) are called bounded above (below).

When investigating different quantities we often use the notion
of the absolute value |a| of a number a:

$$|a| = a \ \text{if}\ a \ge 0 \quad\text{and}\quad |a| = -a \ \text{if}\ a < 0$$

Absolute values possess the following simple properties:
1. |a + b| ≤ |a| + |b|. The inequality is strict in case a
and b have opposite signs and it turns into an equality otherwise.
2. For any a and b we have

$$|ab| = |a|\,|b| \quad\text{and}\quad \sqrt{a^2} = |a|$$


The significance of the last formula is sometimes underestimated 
in elementary mathematical courses and this may be the cause of 
different errors and false conclusions. 
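
The properties of absolute values listed above can be verified on a few sample numbers. The Python sketch below is an illustration only; the helper name `abs_value` and the sample pairs are our own assumptions. It checks the inequality for |a + b|, the condition under which it is strict, the product rule, and the fact that the square root of a² is |a| and not a.

```python
import math


def abs_value(a: float) -> float:
    """Absolute value: a for non-negative a, -a for negative a."""
    return a if a >= 0 else -a


pairs = [(3.0, 4.0), (-3.0, 4.0), (-3.0, -4.0), (0.0, -2.5)]
for a, b in pairs:
    # Property 1: |a + b| <= |a| + |b|, strict exactly when the signs are opposite.
    assert abs_value(a + b) <= abs_value(a) + abs_value(b)
    is_strict = abs_value(a + b) < abs_value(a) + abs_value(b)
    assert is_strict == (a * b < 0)
    # Property 2: |ab| = |a||b| and sqrt(a^2) = |a|.
    assert abs_value(a * b) == abs_value(a) * abs_value(b)
    assert math.sqrt(a * a) == abs_value(a)

# For a negative number the root of the square differs from the number itself.
print(math.sqrt((-3.0) ** 2), abs_value(-3.0))
```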

The quantity |a − b| = |b − a| is equal to the distance
between the points a and b lying on the number scale. The inequality
|x| < h (h > 0) defines the interval −h < x < h, and the inequality
|x − a| < h defines the interval −h < x − a < h, i.e.
a − h < x < a + h. An interval of the form a − h < x < a + h
is called an h-neighbourhood of the point a. The intervals are shaded
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§ 2. Approximate Values of Quantities 


6. The Notion of an Approximate Value. It is usually impossible 
to speak about the absolutely precise value of a physical quantity. 
For example, we can never determine the exact value of the length 
of a real object. This is so not only because our measurements are 
imperfect but also because of a complex form of the body which 
makes it impossible to indicate exactly the points between which 
the length should be measured. If we recall that the object consists 
of molecules which are in permanent motion we see that the situa- 
tion becomes still more complicated. Moreover, there is a vast majo- 
rity of cases when the determination of a length with a great accuracy 
is inexpedient and senseless even when the modern level of measu- 
rement techniques makes such an accuracy attainable. For instance, 
if we have to design or measure a dwelling-house it would be obvio- 
usly senseless to determine the sizes of the building with the accu- 
racy to within 0.01 mm. The same can be said about masses, pres- 
sures etc. The numerical values of almost all quantities in physics 
and engineering (for example, the values of all continuous variables) 
are therefore approximate. 

Mathematical operations on approximate values of quantities 
are called approximate calculations. There exists a special branch 
of science devoted to approximate calculations and we shall study 
some of its rules later on. A. N. Krylov (1863-1945) was one of the
initiators of developing approximate calculations in the USSR. His
book [28] (the first edition was in 1911) still retains its significance.

The appropriate choice of a degree of accuracy for calculations, 
measurements or for manufacturing machine elements is a very 
important operation. When making such a choice one should take 
into account a great many factors, i.e. our requirements, technical 
means, economy etc. 

7. Errors. Let A be the exact value and a an approximate value
of a quantity. Then the error, that is the deviation of the approximate
value from the exact one, is equal to A − a. It may be positive
or negative. As a rule, we do not know the error exactly since
the exact value A is unknown. Therefore we usually consider the
limiting errors α₁ and α₂ which form an interval containing the true
error:

$$\alpha_1 < A - a < \alpha_2, \quad\text{i.e.}\quad a + \alpha_1 < A < a + \alpha_2$$

Thus the value of the quantity A is estimated from two sides. For
instance, the formula of the length $L = 9^{+0.2}_{-0.1}$ mm means that the
true value of the length lies between 9 − 0.1 = 8.9 mm and
9 + 0.2 = 9.2 mm.

It is sometimes inconvenient to consider two limiting errors and
therefore we often indicate the maximum absolute error α, that is
a value which exceeds the absolute value of the error:

$$|A - a| < \alpha, \quad\text{i.e.}\quad -\alpha < A - a < \alpha \quad\text{or}\quad a - \alpha < A < a + \alpha$$

For example, suppose that the measurement of a length l results
in the value 137 cm and that we can guarantee the accuracy of
0.5 cm. This means that we have α = 0.5 cm and 136.5 cm < l <
< 137.5 cm. Therefore we can write l = (137 ± 0.5) cm.

The maximum absolute error of a measurement does not characte- 
rize it completely. For instance, if we are told that the maximum 
absolute error is equal to 1 cm we do not yet know whether this is 
a great error or not. Indeed, for example, if it were the length of 
a whale or of a beetle our judgment would vary respectively. 

The quality of a measurement is better characterized by its maximum
relative error δ which is calculated by the formula

$$\delta = \frac{\alpha}{|a|}$$

The maximum relative error is dimensionless and we often express
it in per cent. The value of a relative error is usually rounded for
the sake of simplicity. For instance, the relative error
for the example of measuring the length l is equal to

$$\delta = \frac{0.5}{137} \times 100\% = 0.36\% \approx 0.4\%$$

i.e. we can say that the maximum relative error of
the measurement is equal to 0.4% (or, rounding, to 0.5%).
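
The two error characteristics can be computed side by side. The following Python sketch is an illustration under our own naming (the function `max_relative_error` and the variable names are assumptions); it uses the example of the length l = 137 cm measured with a maximum absolute error of 0.5 cm.

```python
def max_relative_error(value: float, max_abs_error: float) -> float:
    """Maximum relative error: the maximum absolute error divided by |value|."""
    return max_abs_error / abs(value)


length_cm = 137.0   # approximate value of the length
alpha_cm = 0.5      # maximum absolute error

# Two-sided estimate of the true length.
lower, upper = length_cm - alpha_cm, length_cm + alpha_cm

# Relative error expressed in per cent.
delta_percent = 100.0 * max_relative_error(length_cm, alpha_cm)

print(lower, upper, round(delta_percent, 2))
```

The same absolute error of 0.5 cm would give a very different relative error for a much longer or a much shorter object, which is why the relative error describes the quality of the measurement better.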

The accuracy of the order of 1% or even of 10% is sufficient for 
many approximate calculations. On the other hand, the precise 
measurement of the frequency of electromagnetic vibrations which 
creates the basis of performing the automatic control of spacecraft 
is carried out by means of a crystal or an atomic clock whose error
in time observation is about 10⁻⁴ sec a day (calculate the corresponding
maximum relative error!).

8. Writing Approximate Numbers. It is desirable to write an
approximate number, i.e. an approximate value of a quantity,
in a form which indicates the degree of accuracy. Therefore an
approximate number is usually written in such a way that all the
decimal digits except the last one are correct. The admissible error for
the last decimal digit must not exceed unity (by the way, if the error
is a little greater one usually admits it). For instance, if we write
the value R = 1.35 Ω of a resistance we mean that α_R = 0.01 Ω,
that is in fact 1.34 Ω < R < 1.36 Ω. There is a great difference
between the formulas R = 1.35 Ω and R = 1.3500 Ω since the
former indicates that the corresponding calculations were carried
out with a possible error of 0.01 Ω whereas the latter expresses the
result accurate to 0.0001 Ω. If the result of certain calculations
is R = 2.377 Ω but the third decimal digit may be incorrect, or
if we are not interested in the fourth decimal digit, we should round
off the result and write R = 2.38 Ω.

The number of decimal places to the right of the decimal point 
indicates the maximum absolute error. The total number of correct 
decimal digits (which does not include zeros standing to the left 
of the first nonzero decimal digit) indicates the maximum relative 
error. For instance, the numbers 2.57; 4.7400; 0.015 and 0.00210 
have, respectively, 3, 5, 2 and 3 correct decimal digits. The greater 
the number of correct decimal digits, the smaller the maximum 
relative error. 
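
The counting rule can be expressed as a short program. The Python sketch below is only an illustration: the function names `count_correct_digits` and `decimal_places` are our own, and it assumes the number is given as a plain decimal numeral (no exponent) written according to the above convention, so that every digit after the leading zeros is regarded as correct.

```python
def count_correct_digits(numeral: str) -> int:
    """Count the digits of a decimal numeral, ignoring the sign, the decimal
    point and the zeros standing to the left of the first nonzero digit."""
    digits = numeral.lstrip("+-").replace(".", "")
    return len(digits.lstrip("0"))


def decimal_places(numeral: str) -> int:
    """Number of digits to the right of the decimal point."""
    _, _, fraction = numeral.partition(".")
    return len(fraction)


for numeral in ("2.57", "4.7400", "0.015", "0.00210"):
    print(numeral, count_correct_digits(numeral), decimal_places(numeral))
```

The first count is connected with the maximum relative error and the second with the maximum absolute error.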

One should avoid writing expressions of the form M = 1800 g
since such a form often does not indicate the real accuracy of measurements
or calculations. If we suspect that the second decimal digit
may be incorrect we must write M = 1.8 × 10³ g, and if the fourth
decimal digit is doubtful we must write M = 1.800 × 10³ g. Strictly
speaking, the formula M = 1800 g means that the maximum absolute
error is equal to 1 g. When the rules of writing approximate numbers
are not followed we often encounter various misunderstandings.

9. Addition and Subtraction of Approximate Numbers. Let us take 
an example. Suppose we have weighed a bottle and its cork. Let 
their masses be, respectively, M = 323.1 g and m = 5.722 g (this 
indicates that the scales taken for weighing the cork are more precise). 
It would be wrong to calculate the total weight of the bottle together 
with the cork as 

    M = 323.1
    m =   5.722
    -------------
M + m = 328.822 g


Actually, the weight of the bottle is found with an accuracy 
of 0.1 g and therefore the hundredths and the thousandths entering 
into the result are not only unnecessary but even misleading: judging 
by the answer one might think that the value of M + m is accurate 
to 0.001 g which is wrong. To perform the addition correctly we 
must therefore round off m to 0.1, that is we must calculate in the 
following way: 

    M = 323.1
    m =   5.7
    -----------
M + m = 328.8 g

Of course, we should obtain the same result if we rounded the former
result but such an operation involves unnecessary calculations
which should have been avoided. Thus, the number of decimal digits
entering into a sum must be the same as in the summand with the
greatest absolute error.

If there are many summands the round-off errors may add up 
and this may result in a great error in determining the sum (some
kind of systematic “short measure”). There is a rule of a reserve
decimal digit which is recommended for such cases: the calculations 
are carried out to an extra decimal digit and then we round off 
the result discarding the reserve decimal digit after the sum has been 
calculated. 

For example, let it be required to find the sum 


K = 132.7 + 1.274 + 0.06321 + 20.96 + 46.1521 


The first summand has the largest absolute error which is equal
to 0.1. Therefore we round off all the other summands to 0.01:


132.7 + 1.27 + 0.06 + 20.96 + 46.15 = 201.14 


Now, rounding, we find K = 201.1. If we did not use the rule the
result would be less accurate:


K = 132.7 + 1.3 + 0.1 + 21.0 + 46.2 = 201.3
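
The effect of the reserve decimal digit in the example of the sum K can be reproduced with exact decimal arithmetic. The Python sketch below is an illustration only; the helper `round_to` and the choice of rounding halves upwards are our assumptions and are not prescribed by the text.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to(x: Decimal, places: int) -> Decimal:
    """Round x to the given number of digits after the decimal point."""
    return x.quantize(Decimal(10) ** -places, rounding=ROUND_HALF_UP)


summands = [Decimal(s) for s in ("132.7", "1.274", "0.06321", "20.96", "46.1521")]

# Rule of a reserve digit: keep two places while adding, then round to one.
with_reserve = round_to(sum(round_to(s, 2) for s in summands), 1)

# Without the rule: every summand is rounded to one place before adding.
without_reserve = sum(round_to(s, 1) for s in summands)

print(with_reserve, without_reserve)
```

The round-off errors of the individual summands accumulate in the second variant, which is what the reserve digit is meant to prevent.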


Another example. Suppose it is necessary to find the sum
N = √5 + √6 + √7 + √8 with an accuracy of 0.01. The integers
under the radical signs are regarded as quite exact. Using the rule
of a reserve decimal digit we take from a table the values of the
roots accurate to 0.001:


2.236 + 2.449 + 2.646 + 2.828 = 10.159


Thus, N = 10.16.

If the number of summands is very large, say several hundreds, 
it is advisable to use two reserve decimal digits. 

When we add several summands given with the same number of 
decimal digits to the right of the decimal point we must take into 
account that the maximum absolute error of the sum can be greater 
than those of the summands. It is therefore expedient to round the 
result to the preceding decimal place. For example, let 


L = 1.38 + 8.71 + 4.48 + 11.96 + 7.33 


Adding we get L = 33.86. But the last decimal digit is likely to
be incorrect and therefore we should write the answer in the form
L = 33.9.

The maximum absolute error of a sum or of a difference is equal to 
the sum of the maximum absolute errors of the operands. For instance, 
if we have two quantities determined with an accuracy of 0.1 then 
it is easy to understand that the sum and the difference of the quan- 
tities are determined with an accuracy of 0.2 because the errors may 
add up. If there are many summands it is unlikely that all the errors 
would add up. In such cases one should use the methods of the theory 
of probabilities (see Sec. XVIII.15) in order to estimate the error 
of the sum. These methods imply that we should round the sum,
discarding one decimal digit (as was done above in calculating L)
when there are approximately five or more summands, and two decimal
digits when there are approximately 500 or more summands.

The rules of subtraction are essentially the same as the rules of 
addition of approximate numbers. At the same time we should take 
into account that when subtracting approximate numbers which 
are close to each other we may get a considerable increase of the 
relative error. For instance, let it be necessary to calculate
P = 327.48 − 326.91. The minuend and the subtrahend have α = 0.01
and therefore δ ≈ (0.01/327) × 100% = 0.003%. But the maximum absolute
error of the difference P = 0.57 is equal to 0.02 and hence its maximum
relative error is δ_P = (0.02/0.57) × 100% = 3.5%. The relative
error has thus increased 1000 times!
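
The growth of the relative error in this subtraction is easy to recompute. The Python sketch below is an illustration with variable names of our own choosing; it uses the numbers 327.48 and 326.91, each with a maximum absolute error of 0.01, and the rule that the absolute errors of the operands add up.

```python
minuend, subtrahend = 327.48, 326.91
alpha = 0.01                      # maximum absolute error of each operand

difference = minuend - subtrahend
alpha_difference = alpha + alpha  # absolute errors add up in a difference

# Maximum relative errors in per cent.
delta_operand = 100.0 * alpha / minuend
delta_difference = 100.0 * alpha_difference / difference

growth = delta_difference / delta_operand
print(round(delta_operand, 3), round(delta_difference, 1), round(growth))
```

With unrounded figures the growth factor is somewhat above one thousand, in agreement with the order of magnitude stated in the text.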

Therefore one must try to avoid calculating the difference of two 
close numbers. In such a case we should transform the corresponding 
expressions in an appropriate way in order to find the difference 
without actually carrying out such a subtraction: one must not try 
to determine the weight of one’s hat by weighing oneself first with 
the hat on and then without it! 

When we deal with formulas containing differences of this kind
which can noticeably affect the accuracy of calculations we should
eliminate the differences by transforming the expressions. For
example, calculating an expression of the form Q = a − √(a² − b²)
(a > 0, b > 0) where b is several times smaller than a (and therefore
√(a² − b²) ≈ √(a²) = a) we can transform the expression in the
following way:

$$Q = \frac{\left(a - \sqrt{a^2 - b^2}\right)\left(a + \sqrt{a^2 - b^2}\right)}{a + \sqrt{a^2 - b^2}} = \frac{b^2}{a + \sqrt{a^2 - b^2}}$$


The last expression no longer contains the undesirable difference. 
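
The advantage of the transformed expression shows up as soon as every operation is carried out with a limited number of digits. The Python sketch below is an illustration only: it assumes four significant digits per operation (set through a `decimal` context), the sample values a = 10 and b = 1, and the hypothetical function names `q_direct` and `q_transformed` for Q = a − √(a² − b²) and for b²/(a + √(a² − b²)).

```python
from decimal import Context, Decimal

ctx = Context(prec=4)  # every operation keeps four significant digits


def root(a: Decimal, b: Decimal) -> Decimal:
    """sqrt(a^2 - b^2), rounded to four significant digits at each step."""
    return ctx.sqrt(ctx.subtract(ctx.multiply(a, a), ctx.multiply(b, b)))


def q_direct(a: Decimal, b: Decimal) -> Decimal:
    """Q computed as the difference of two close numbers."""
    return ctx.subtract(a, root(a, b))


def q_transformed(a: Decimal, b: Decimal) -> Decimal:
    """Q computed from the transformed expression without the difference."""
    return ctx.divide(ctx.multiply(b, b), ctx.add(a, root(a, b)))


a, b = Decimal(10), Decimal(1)
print(q_direct(a, b), q_transformed(a, b))
```

For these values the direct formula retains only two significant digits of Q, while the transformed one retains all four available digits.
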
10. Multiplication and Division of Approximate Numbers. General
Remarks. Let us begin with an example. Suppose it is necessary to
determine the area S of a rectangle with the sides a = 5.2 cm and
b = 43.1 cm. It would be wrong to give the answer
S = 5.2 × 43.1 = 224.12 cm².
In fact, a is contained between 5.1 and 5.3, and b between 43.0
and 43.2. Thus, the area is contained between

S₁ = 5.1 × 43.0 = 219.3 cm² and
S₂ = 5.3 × 43.2 = 228.96 cm²

We see that all the decimal digits beginning with the second one in
the above value of S may be incorrect and therefore they may only
lead to misconceptions. In this case the correct answer we must give
is S = 2.2 × 10² cm².

By the way, we note that the calculation of S₁ and S₂ demonstrates
the way that can be followed in estimating the results in other problems.
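
The two-sided estimate of a product can be written as a small routine. The Python sketch below is an illustration only; the function name `product_bounds` is our own, the factors are assumed to be positive, and the sides 5.2 cm and 43.1 cm are each taken with a maximum absolute error of 0.1 cm as in the example of the rectangle.

```python
from decimal import Decimal


def product_bounds(a, alpha_a, b, alpha_b):
    """Lower and upper bounds of the product of two positive approximate
    numbers a and b given with maximum absolute errors alpha_a and alpha_b."""
    return (a - alpha_a) * (b - alpha_b), (a + alpha_a) * (b + alpha_b)


a, b = Decimal("5.2"), Decimal("43.1")
alpha = Decimal("0.1")

s_min, s_max = product_bounds(a, alpha, b, alpha)
print(s_min, a * b, s_max)
```

The bounds already differ in the second digit, so only two digits of the product can be trusted.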

Thus we see that in multiplying two numbers with two and three 
correct decimal digits we must retain two decimal digits in the 
answer. The same rule holds for the general case of multiplication 
of approximate numbers and also for their division: the number 
of correct decimal digits in the result must be equal to the smallest 
of the numbers of correct decimal digits in the factors (or in the 
dividend and the divisor in the case of division). The reason for 
this general rule is that, in the first place, the operations of multi- 
plication and division performed on approximate numbers yield 
the addition of the corresponding maximum relative errors (this 
will be shown in Sec. IX.11) and, in the second place, the number 
of correct decimal digits and the maximum relative error indicate 
similar qualities connected with the degree of relative accuracy. 

In the example of calculating S the maximum relative error of b
is considerably smaller than that of a and therefore
δ_S = δ_a + δ_b ≈ δ_a, that is S has the same number of correct decimal digits
as a.

If the factors entering into a product are given with different 
numbers of correct decimal digits we must round the numbers before 
multiplying them and retain one reserve decimal digit which is 
discarded after the operation is performed. In case the factors have 
the same number of correct decimal digits but there are many factors 
(for instance, more than four) it is advisable to reduce the number 
of correct digits in the product by one. 

As an example, let us take the formula Q = 0.24J²Rt which is
applied to calculating heat generated by an electric current. In
this case the answer cannot have more than two correct decimal
digits because the coefficient 0.24 has only two correct digits. Therefore
there is no sense in taking J, R and t with more than three correct
decimal digits (moreover, the third digit is taken only as a reserve
digit). If a more accurate value of Q is desirable we must first of all
specify the value of the coefficient more precisely.

It should be noted that absolutely exact factors do not affect 
the choice of the number of correct decimal digits in a product. For 
instance, the coefficient 2 entering into the formula L = 2πr of the
circumference of a circle is absolutely exact (we can write it as
2.0 or 2.00 etc.) and therefore the accuracy of calculations depends
only on the number of correct decimal digits to which π and r are
computed.

Let us take an example involving all the above rules. Let
D = 11.32 × 5.4 + 0.381 × 9.4 + 7.43 × 21.1. In order to estimate
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the magnitude of the summands we calculate them rounding to one 
correct decimal digit. Thus we get 500, 3.6 and 140. Hence the 
sum is of the order of several hundreds. The factor 5.4 entering into 
the first summand (which is the largest one) has only two correct 
decimal digits and thus the whole result must have two correct 
digits. Now, according to the rule of a reserve decimal digit, we 
must calculate to within unity and then round off the result to the 
nearest ten. Thus we obtain D = 690 + 3 + 157 = 850, i.e.
D = 8.5 × 10².

Calculations with unnecessary digits are not only useless but 
even misleading because they may give the illusion of an accuracy 
greater than that we actually have. 

The choice of a degree of accuracy of approximate quantities for 
performing mathematical operations on them is made in accordance 
with a general principle which states that all the degrees of accuracy
which we choose must be consistent with one another at every stage of
our calculations. This means that none of the degrees must be too
high or too low.

We shall take an example to illustrate the principle. Suppose 
we have to calculate the area of a rectangle by the formula S = ab. 
Let a be measured or calculated to three correct decimal digits. 
Then we must take b also with three correct digits because the fourth 
decimal digit of b would be useless whereas if we determined b
only with two correct digits the efforts applied to finding the third
digit of a would be futile. Therefore when we calculate a product
it is convenient to take the factors (at least those factors which are
difficult to determine) with the same number of correct decimal
digits. Similarly, the summands entering into a sum must be taken
with the same number of decimal digits to the right of the decimal
point.

Here we give an example. Let the expression M = ab + cd
be calculated and let it be known that a ≈ 30, b ≈ 6, c ≈ 0.1
and d ≈ 40. Suppose that a is taken with three correct decimal
digits. What number of correct digits should be chosen for b, c
and d? It is clear that we must take three correct decimal digits
for b according to the accuracy of a. Further, we have ab ≈ 180 and
cd ≈ 4. This implies that for calculating M with three correct
digits (the accuracy of a makes it impossible to obtain M with 
more than three correct digits) it is sufficient to determine c and 
d with only one correct decimal digit. If it is not too difficult the 
accuracies of b, c and d should be increased by one decimal digit 
but the extra digit is only a reserve one. 

When performing practical calculations we often face a problem 
which is in some sense inverse to the above problem. The degree 
of accuracy of a desired result is sometimes set beforehand according 
to some prerequisites and then it is required to determine the necessary 
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degrees of accuracy of the quantities involved into the calculations 
(and the accuracy of the calculations). Some of the quantities may 
be obtained as a result of an experiment and therefore our discussion 
also applies to the determination of a desirable precision of an 
experiment. The solution of the inverse problem is based on the 
rules of approximate calculations we have studied here. For example,
suppose we have to calculate the total surface area of a circular
cylinder by the formula

$$S = \pi \left( DH + \frac{D^2}{2} \right)$$

Let it be approximately known that D ≈ 20 cm and H ≈ 2 cm. Then S ≈ 700 cm²
(check it up!). Now turning to the inverse problem and reasoning
as in the preceding paragraph we see that if, for instance, we want
to have the result with three correct decimal digits, i.e. with an
accuracy of 1 cm², then π and D should be taken with three correct
decimal digits and H with two correct digits. Thus, measuring D
and H we must attain the accuracy of 1 mm. It is better to calculate
with a reserve decimal digit, and π should also be taken with a reserve
digit. But if we wanted to have more accurate values of D
and H we would have to perfect our measuring instruments.
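
The rough estimate of the surface area asked for above can be checked directly. The Python sketch below is an illustration only: it assumes that D is the diameter and H the height of the cylinder, so that the total surface area is the lateral surface πDH plus two circular ends, S = π(DH + D²/2); the function name `cylinder_surface` is our own.

```python
import math


def cylinder_surface(diameter: float, height: float) -> float:
    """Total surface area of a circular cylinder: lateral surface plus two ends."""
    return math.pi * (diameter * height + diameter ** 2 / 2)


# Approximate values D = 20 cm and H = 2 cm from the example.
s = cylinder_surface(20.0, 2.0)
print(round(s))
```

The result is of the order of several hundred square centimetres, so three correct digits correspond to an accuracy of about 1 cm².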

The rules of determining degrees of accuracy for more complicated 
formulas will be given in Secs. IV.10 and IX.11. 


§ 3. Functions and Graphs 


11. Functional Relation. When investigating a phenomenon or 
a problem we often deal with several variables which are interrelated 
so that a change of one of the variables affects the values of the 
others. Then we say that there is a functional relation between the 
variables. For example, suppose a mass of a gas is kept under chan- 
ging conditions. Then there is a functional relation between the 
volume V, the temperature T and the pressure p of the gas because, 
as is well known from physics, the quantities are interrelated. We 
also have a functional relation between the area of a circle and its 
radius, between the distance passed over in a process of motion and 
the time taken and so forth. 

Usually it is possible to pick out certain variables from a number 
of interrelated quantities such that the values of the variables can 
be taken arbitrarily whereas the values of the other quantities are 
determined by the values of the variables entering into the first 
group. The variables of the first type are called independent vari- 
ables (or arguments) and the variables of the second type are called 
dependent variables (or functions). As an example let us consider 
the relationship between the area S of a circle and the length R 
of its radius. It is natural to regard R as an independent variable 
and choose its values arbitrarily; then the area computed by the 
formula S = πR² is a dependent variable in this functional relation. 
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In the example of the mass of gas we could have taken V and T 
as independent variables. Then the variable p (the pressure) would 
have been regarded as a dependent variable. 

A rule (a law) according to which to the values of independent variables 
there correspond the values of a dependent variable is called a function. 
Thus, every time there is a law of correspondence between the values 
of variables we say that there is a functional relation. The concept 
of a function is one of the most important mathematical notions. 

By the way, the term “function” is sometimes used in a different 
sense. As it has been mentioned, independent variables are called 
arguments and a dependent variable itself is called a function. 
Such a twofold sense of the term does not, however, lead to any 
misunderstandings. 

It should be noted that when we have a functional relation between 
variables the distinction between the independent variables and 
the dependent ones is sometimes conditional. For instance, in the 
example of the mass of gas we could have taken T and p as indepen- 
dent variables and V as a function. We can easily construct the 
scheme of an experiment in which T and p can be varied arbitrarily 
whereas V depends on T and p. Of course, the choice of variables 
which are regarded as independent quantities may be important 
in some cases. The choice should be made in a natural and conve- 
nient way in accordance with the circumstances. 

Functions may depend on one argument (as in the example of the 
area of a circle) or on two or more arguments. In the first two chapters 
of our course we shall consider (almost without exceptions) functions 
of one independent variable. 

We must note that when we regard a quantity y as a function
of an independent variable x we do not necessarily suppose that
there is a meaningful causal relationship between the variables.
It is quite sufficient if there exists a rule which attributes a certain
value of y to each x even if we do not know this law of correspondence.
For example, the temperature θ at a point in space can be
regarded as a function of time t since it is clear that we always have
a certain temperature at the point at each moment t, that is to the
values of t there correspond the values of θ, although the variations
of θ cannot be simply accounted for by the changes of t since in
reality these variations are determined by some complicated physical
laws.

12. Notation. If a variable y is a function of a variable x we 
usually write y = f (x) (this is read as “y is equal to f of x”) where 
f is the sign of a function. If we make x assume certain particular
(concrete) values the function will assume its particular values.

For instance, let y = f (x) be of the form y = x². Then y = 4
for x = 2, y = 0.36 for x = −0.6 etc. This can be written as
f (2) = 4, f (−0.6) = 0.36 and so on, or as $y|_{x=2} = 4$, $y|_{x=-0.6} = 0.36$
etc. The vertical lines in the last expressions are the signs
of substitution which mean that we substitute the values of x for
the argument.
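
The functional notation corresponds closely to the definition of a function in a programming language. The Python sketch below is an illustration only (the name `f` simply mirrors the sign of the function in the text): the rule y = x² is stated once, and particular values are obtained by substituting particular values of the argument.

```python
import math


def f(x: float) -> float:
    """The rule y = x^2 assigning a value of y to each value of x."""
    return x ** 2


# Particular values of the function for particular values of the argument.
print(f(2), f(-0.6))

assert f(2) == 4
assert math.isclose(f(-0.6), 0.36)
```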

The notation y = f (x) is used when the concrete expression of
a function is too complicated or when we do not know the expression.
It is also used for formulating general rules and properties of all
functions or of many concrete functions. For example, the formula
(a + b)³ = a³ + 3a²b + 3ab² + b³ which is well known from algebra
is written in letters. Here the letters a and b are not concrete
numbers but they can be replaced by any numbers.

If we consider several functions simultaneously we can use, besides 
f, any other letters: F, φ, Φ etc. We can also introduce different 
subscripts, superscripts, and other indices: $f_1$, $f_2$, $F^*$ etc. 
At the same time when we consider different problems we can 
denote different functions by the same letter f. We remind the 
reader that we have a similar situation in algebra: a letter, say 
the letter a, may denote different quantities in different problems, 
but we must not denote by a different quantities entering into one 
and the same problem. On the other hand, different quantities 
may sometimes be connected by one and the same functional 
relation. In such a case we can use one and the same letter f because 
f designates the law of dependence of one quantity upon another 
and is irrelevant to the way the quantities are denoted. For example, 
if $y = x^3$, $z = u^5$ and $v = t^3$ then we can write y = f (x), z = φ (u) 
and v = f (t). In this case the sign f indicates raising to the third 
power whereas φ indicates the operation of raising to the fifth power. 

Functions of several variables are denoted similarly. For instance, 
let $z = x^2 - 2y^2$. Here x and y are independent variables and z 
is regarded as a dependent variable. We can write z = f (x, y) where 
the comma is a sign indicating the dependence of the function on 
two arguments. In this case particular values are found in the 
following way: 
and so on. 

One must get used to this notation and operate on it quite freely. 
Here we give several examples of such operations. Suppose we have 
the functions $y = f(x) = x^2 - 3x$ and $z = \varphi(x) = 2x + 1$, and 
let a be a constant number. Then 

$f(a) = a^2 - 3a$ (this is the value of the first function for x = a); 

$\varphi(a^2) = 2a^2 + 1$ (this is the value of the second function for $x = a^2$); 

$f(x^2) = (x^2)^2 - 3x^2 = x^4 - 3x^2$ [this is the value of y assumed 
when $x^2$ is substituted for the argument; we thus obtain a new 
function of x which may be denoted as F (x)]; 

$[f(x)]^2 = (x^2 - 3x)^2 = x^4 - 6x^3 + 9x^2$ (this is one more 
function of x); 

$\varphi(x + a) = 2(x + a) + 1 = 2x + 2a + 1$ (this is another new 
function of x); 

$f(x)\,\varphi(x) = (x^2 - 3x)(2x + 1) = 2x^3 - 5x^2 - 3x$; 

$f(\varphi(x)) = [\varphi(x)]^2 - 3\varphi(x) = (2x + 1)^2 - 3(2x + 1) = 4x^2 - 2x - 2$; 

$\varphi(f(x)) = 2f(x) + 1 = 2(x^2 - 3x) + 1 = 2x^2 - 6x + 1$; 

$f(x + s) = (x + s)^2 - 3(x + s) = x^2 + 2xs + s^2 - 3x - 3s$ 
[this is a function of two variables which can be denoted by Φ (x, s)] 
etc. 
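
Substitutions of this kind are easy to check numerically. The following is a minimal Python sketch, not part of the original course: it takes f (x) = x^2 − 3x and φ (x) = 2x + 1 and compares several of the expanded expressions with direct evaluation. The helper name compose is an assumption introduced here for illustration.

```python
def f(x):
    """First function: f(x) = x^2 - 3x."""
    return x**2 - 3*x

def phi(x):
    """Second function: phi(x) = 2x + 1."""
    return 2*x + 1

def compose(outer, inner):
    """Return the composite function x -> outer(inner(x)) (illustrative helper)."""
    return lambda x: outer(inner(x))

f_of_phi = compose(f, phi)   # expected to equal 4x^2 - 2x - 2
phi_of_f = compose(phi, f)   # expected to equal 2x^2 - 6x + 1

# Compare direct evaluation with the expanded formulas at a few points.
for x in (-2, 0, 1, 3):
    assert f_of_phi(x) == 4*x**2 - 2*x - 2
    assert phi_of_f(x) == 2*x**2 - 6*x + 1
    assert f(x) * phi(x) == 2*x**3 - 5*x**2 - 3*x
```

The order of composition matters: f (φ (x)) and φ (f (x)) are different functions of x.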

In particular, in the above examples we come across the operation 
of composing “a function of a function” or, as it is usually said, 
we deal with a composite function. A composite function is usually 
obtained in the following way. Let a variable y depend on a variable 
u and let u, in its turn, depend on a variable x. Thus, y = f (u) 
and u = φ (x). Then variations in x change u and therefore y also 
varies. Hence, y is a function of x of the form y = f (φ (x)). Thus 
we obtain a composite function. In this case u is an intermediate 
variable. There may also be several intermediate variables. 

If we only want to indicate that y is a function of x, avoiding all 
the intermediate operations, we can write y = y (x). For instance, 
in the examples in Sec. 11 we could have written S = S (R) or 
p = p (V, T) and V = V (T, p). 

13. Methods of Representing Functions. If we intend to investi- 
gate a function, that is a dependence of one quantity upon another, 
the function must be represented in a certain way. There are several 
methods of representing functions. 

The analytical method (i.e. representing a function by a formula) 
is one of the most widely used methods in mathematics. This method 
describes the mathematical operations which should be performed 
on the independent variable to obtain the value of the function. The 
operations are indicated by a formula. For example, the formula 
$y = x^2 - 2x$ says that in order to compute the value of the function 
y we must raise the corresponding value of the argument to the second 
power and then subtract the doubled value of the argument from 
the result. 

The analytical method is compact (i.e. formulas usually occupy 
little space), it can be easily reproduced (i.e. it is not difficult to 
rewrite a formula). Besides, it is the most suitable method for per- 
forming mathematical operations on functions. Here we mean the 
algebraic operations (addition, multiplication and so on), the ope- 
rations of higher mathematics (differentiation, integration and the 
like), and others. But the method is not visual enough (this means 
that when we have a formula it is not always possible to visualize 
the character of dependence of the function upon its argument). 
The calculation of particular values of a function represented by 
a formula (in case the values are needed) may be a complicated ope- 
ration. In addition, not all functions can be represented by a formu- 
la, and it may be inconvenient to put down a formula even when 
it exists. 

It is sometimes necessary to use several different formulas to 
represent a function on different parts of the range of its argument. 
For example, let a material point fall without an initial velocity 
on a platform which is moving downwards uniformly with the velocity 
v, the distance between the point and the platform at the moment 
t = 0 being equal to h. Then the path s covered by the point 
is a function of the time t, i.e. s = f (t). According to Fig. 6 the 
relationship is determined by the formula 

$$s = \begin{cases} \dfrac{gt^2}{2} & (0 \le t \le t^*), \\ h + vt & (t^* \le t < \infty), \end{cases}$$

where g is the acceleration of gravity and t* is the moment of the 
impact of the point against the platform, which can be found from 
the equation $\frac{gt^2}{2} = h + vt$. 

[Fig. 6. Positions of the point and of the platform along the s-axis at the moments t = 0 and t] 

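
A function given by different formulas on different parts of the range of its argument can be coded as one routine with a branch. The sketch below is illustrative only and assumes the reading of the example in which the point falls freely (s = g t^2 / 2) until it meets the platform and then moves together with it (s = h + v t). The value of g and the names impact_time and path are assumptions made here.

```python
import math

G = 9.81  # acceleration of gravity in m/s^2 (assumed value)

def impact_time(h, v, g=G):
    """Positive root t* of the equation g*t^2/2 = h + v*t."""
    return (v + math.sqrt(v**2 + 2*g*h)) / g

def path(t, h, v, g=G):
    """Path s = f(t) covered by the point, given by two formulas."""
    if t < 0:
        raise ValueError("the motion is considered for t >= 0 only")
    t_star = impact_time(h, v, g)
    if t <= t_star:
        return g * t**2 / 2   # free fall before the impact
    return h + v * t          # after the impact the point moves with the platform

t_star = impact_time(h=10.0, v=2.0)
```

Both formulas give the same value at t = t*, so the two pieces join without a jump.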
The tabular representation of a function gives the numerical values 
of the function for certain discrete values of the argument. Such 
a table may have the following form: 

    x         | x_1           | x_2           | x_3           | ...
    y = f (x) | y_1 = f (x_1) | y_2 = f (x_2) | y_3 = f (x_3) | ...      (2)

Each of the differences $x_2 - x_1$, $x_3 - x_2$, ... is called a step of the 
table. The tables with a constant step are the most convenient; an 
argument x entering into such a table is taken for the values a, 
a + h, a + 2h, .... Here the constant step is denoted by h. The 
well-known tables of logarithms, of trigonometric functions etc. 
are examples of tabular representation of functions. In these tables, 
to save space, the values of the functions are written in lines (like 
words in a book) but not in a single row [as in (2)]. There are many 
other different tables of various important functions. Some tables 
represent a result of an experiment in which the values of one of 
the quantities are set beforehand and the values of the other 
quantities are measured etc. 

The great advantage of the tabular method is that the values of 
a function are already calculated and therefore they can be imme- 
diately utilized. But we sometimes need the values of a function 
represented by a table which correspond to the values of the argument 
that are not included into the table. Then we have to perform some 
additional operations, namely, the operation of interpolation (i.e. 
calculating the values of the function for intermediate values of 
the argument) or the operation of extrapolation (i.e. calculating the 
values of the function for the values of the argument that fall out- 
side the table). These additional calculations often yield incorrect 
results. Tables sometimes occupy much space. Building up a table 
usually requires much work. But in recent years the development 
of modern computer techniques has made it possible to calculate 
tables more quickly. 
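
As an illustration of tabulation with a constant step and of interpolation, here is a minimal Python sketch. It is not the author's procedure: linear interpolation is only one of several possible rules, and the names tabulate and interpolate are assumptions introduced here.

```python
import math

def tabulate(func, a, h, n):
    """Table with constant step h: arguments a, a + h, ..., a + (n - 1)h."""
    xs = [a + i * h for i in range(n)]
    return xs, [func(x) for x in xs]

def interpolate(xs, ys, x):
    """Linear interpolation between neighbouring entries of the table.

    Arguments outside the table are rejected rather than extrapolated.
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("argument falls outside the table")
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])

xs, ys = tabulate(math.sin, 0.0, 0.1, 11)   # sin x for x = 0, 0.1, ..., 1.0
approx = interpolate(xs, ys, 0.25)          # 0.25 is not an entry of the table
error = abs(approx - math.sin(0.25))        # small but not zero
```

The interpolated value differs slightly from the exact one, and a smaller step h reduces this error at the cost of a longer table.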

The disadvantage of the method is that it is inconvenient for 
performing mathematical operations because each new operation 
can make it necessary to compile a new table, which is hard work. 
Besides, this sometimes cannot be done with the desired accuracy. 

The third basic method of representing functions is the graphical 
method. The method represents a function by means of constructing 
its graph which enables us to visualize the character of variations 
of the function. Besides, when we have the graph of a function we 
can easily find approximate values of the function accurate to one 
or two decimal places but, of course, only in a given range of the 
argument. The construction of a more accurate graph requires much 
effort and yet the accuracy of the values of the function obtained 
by means of the graph may not be sufficient. It should be noted 
that some graphs characterizing an experiment may be drawn by 
a self-recording apparatus. 

The fourth method of representing functions has been widely 
spread in recent years and is now becoming one of the most impor- 
tant methods. This is the method of compiling a program for calcu- 
lating the values of a given function with the help of an electronic 
computer. We shall discuss the method in Sec. XIX.6. 

All these methods in a certain sense supplement one another. We 
often come across the problem of passing from one method to another, 
that is the problem of constructing a graph, of compiling a table 
(the so-called tabulation) or the problem of finding a suitable for- 
mula. Later on we shall discuss the problems of this kind. 

There are, of course, many other ways of representing functions. 
We can sometimes give a verbal explanation of the law of corres- 
pondence between an independent variable and a function. For 
instance, we can say that a tax is such-and-such function of one’s 
income. 

A definition of the concept of a function close to the modern one 
was first given by the Swiss mathematician Johann Bernoulli 
(1667-1748) in 1718. But in the 18th century functions were usually 
understood in the sense of an analytical formula. The modern general 
concept of a function based on the notion of a law of interdependence 
between variables was first introduced by Euler in 1755 but it became 
universally recognized only in the 19th century. 

14. Graphs of Functions. Graphs serve for the geometrical 
representation of functions. We shall remind the reader of the 
techniques of constructing graphs which are known from 
elementary mathematical courses. Let a variable y be a function 
of a variable x, i.e. y = f (x). In order to construct the graph of 
the function we choose two number scales (axes) lying in a plane. 
The x-axis is usually drawn from left to right and is called the 
axis of abscissas and the y-axis is perpendicular to the x-axis and 
is called the axis of ordinates. The origin from which the coordinates 
are reckoned is often chosen at the intersection point of the 
axes (see Fig. 7). Then we make the argument take on different 
values and find the corresponding values of y = f (x) which enables 
us to construct the points of the graph. 

[Fig. 7] 

The point M shown in Fig. 7 is an arbitrary “moving” (variable) 
point of the graph and it has current coordinates x, y. Practically 
we cannot construct a very large number of points. We then connect 
the points with a curve and thus obtain an approximate representation 
of the graph. But theoretically we interpret such a construction 
as if the variable x ran through its whole range; then the moving 
point M should run along the whole graph. Fig. 7 represents an 
example of a graph. We see that in this case the value of the function 
first increases as the argument x increases; such an increase lasts 
until x approaches, approximately, the value x = 0.5; then the 
function begins to decrease (comparatively slowly) and beginning 
with x = 2 the function increases again with an increasing rate. 
The units of length and the origins on each of the axes should be 
chosen in such a way that all the most interesting peculiarities of 
the behaviour of the function on the corresponding intervals of the 
ranges of the argument and of the function are represented 
most clearly. 

For example, let us consider the graph of a function which des- 
cribes a uniformly accelerated motion. Let the law of motion be 


s = 98 +0.01 (¢> 0) (3) 


where t is measured in seconds and s is measured in centimetres. In 
this case we can choose the number scales on the axes in the way 
shown in Fig. 8. It is clear that a change of the position of the origin 
on the axis of the argument or on the axis of the function results, 
respectively, in the parallel translation of the graph as a whole 
along the x-axis or along the y-axis. If we increase or decrease the 
unit of length along one of the axes then the graph will, respectively, 
expand or contract along the same direction in such a way that 
the distances from all the points of the graph to the other axis will 
increase or decrease the same number of times. For instance, Fig. 9 
represents the form of the graph of the same function (3) after the 
scale for the t-axis has been changed. The new graph is obtained from 
the original one by expansion in the direction parallel to the t-axis. 

[Fig. 8] [Fig. 9] 

In order to represent the behaviour of a function in the best way 
one sometimes uses non-uniform scales which were already mentioned 
in Sec. 4. 

In what follows we shall regard variables (both arguments and 
functions) as dimensionless unless the contrary is explicitly stated. 
In theoretical investigations the simplest thing is to take equal 
units of length for both axes and reckon the coordinates from the 
intersection point of the axes (which is then called the origin of 
the coordinate system) and we shall follow this way. We have already 
described above in what way changes of scales or of the position 
of the origin affect the form of a graph. 

15. The Domain of Definition of a Function. The domain of 
definition of a function is the totality of all the values of the 
independent variable for which the function is defined, that is the admissible 
range of the independent variable (see Sec. 5). We usually consider 
continuous variables and in such cases, as it was pointed out in 
Sec. 5, the domain of definition consists of one or several intervals. 
The structure of the domain of definition of a function is sometimes 
implied by a physical or geometrical meaning of the function. For 
example, if we consider the relation $S = \pi R^2$ describing the 
dependence of the area of a circle upon its radius the domain of definition 
of the function is 0 < R < ∞ since the geometrical meaning of 
R implies these very values. In case we consider the dependence of 
the atmosphere density ρ at a point (lying above a given point of 
the earth’s surface at the height h above sea level) the domain of 
definition of the function ρ = ρ (h) is the interval $h_0 \le h \le H$ 
where $h_0$ is the height of the earth’s surface and H is a conditional 
height which is regarded as the limit of the atmosphere. If a function 
is represented by a formula then the set of all the values of the 
argument for which this formula gives a certain real value of the 
function is regarded as its domain of definition (as long as we 
consider real functions of a real argument, that is functions for which 
both the independent variable and the dependent one assume only 
real values). For instance, if $y = x^2$ then x can take on any real 
values, i.e. the domain of definition is the whole number line 
−∞ < x < ∞. If $y = \sqrt{x^2 - 2}$ then we cannot obtain real values 
of y while extracting the root in case we have $x^2 - 2 < 0$. Consequently, 
there must be $x^2 - 2 \ge 0$, i.e. $x^2 \ge 2$. The last inequality is fulfilled 
for $x \le -\sqrt{2}$ or $x \ge \sqrt{2}$; the domain of definition consists of two 
intervals $-\infty < x \le -\sqrt{2}$ and $\sqrt{2} \le x < \infty$ (the domain is 
shaded in Fig. 10). In other analogous cases in order to determine 
the domain of definition of a function we must first find out what 
may prevent us from getting real values of the function and then 
form inequalities (as it was done in the last example when we put 
down $x^2 - 2 \ge 0$) which guarantee the possibility of obtaining 
real values. Then the problem of determining the domain of 
definition is reduced to solving these inequalities. 
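
The same reasoning can be mirrored in a program: before the root is extracted we test the inequality that guarantees a real value. The Python sketch below is illustrative only, for the function y = sqrt(x^2 − 2); the names in_domain and y are assumptions made here.

```python
import math

def in_domain(x):
    """True if y = sqrt(x^2 - 2) takes a real value at x, i.e. x^2 - 2 >= 0."""
    return x**2 - 2 >= 0

def y(x):
    """Value of the function; arguments outside the domain are rejected."""
    if not in_domain(x):
        raise ValueError(f"x = {x} lies outside the domain of definition")
    return math.sqrt(x**2 - 2)

samples = [-3, -1.5, -1, 0, 1, 1.5, 3]
admissible = [x for x in samples if in_domain(x)]   # points with |x| >= sqrt(2)
```

Only the sample points with |x| at least sqrt(2) survive the test, in agreement with the two intervals found above.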

If an independent variable is discrete the domain of definition 
of the corresponding function consists of discrete (separate) points. 
For instance, if $f(x) = x! = 1 \cdot 2 \cdots x$ then x can assume 
only the values 1, 2, 3, .... In case a discrete argument takes on 
only integer values, as in the above example, it is usually denoted 
not by x but by the letters n, m, k and the like whereas the values 
f (1), f (2), ..., f (n), ... are denoted as $a_1, a_2, \ldots, a_n, \ldots$. 
In such a case we say that there is a sequence; for example, 
a geometric progression of the form 

$$a,\ aq,\ aq^2,\ \ldots,\ aq^{n-1},\ \ldots \qquad (a_n = aq^{n-1})$$

is an example of a sequence etc. The graph of a function of a discrete 
argument is not a continuous line but consists of discrete points 
(see Fig. 11). 
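
A sequence is simply a function of an integer argument, which the following short Python sketch makes explicit. The function names are assumptions introduced for illustration, and the general term a q^(n − 1) of the geometric progression is the standard one.

```python
import math

def factorial_sequence(n_terms):
    """Values f(1), f(2), ..., f(n_terms) of the function f(n) = n!."""
    return [math.factorial(n) for n in range(1, n_terms + 1)]

def geometric_progression(a, q, n_terms):
    """Terms a_n = a * q**(n - 1) for n = 1, 2, ..., n_terms."""
    return [a * q ** (n - 1) for n in range(1, n_terms + 1)]

first_factorials = factorial_sequence(5)
progression = geometric_progression(3, 2, 4)
```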

The range of a dependent variable, that is the set of all the values 
assumed by a function as its argument runs over the domain of 
definition of the function, is called the range of the function. For 
example, the domain of definition of the function $y = x^2$ is the 
interval −∞ < x < ∞ and the range of this function is the interval 
0 ≤ y < ∞ since in this case y assumes only non-negative values. 

The determination of the domain of definition of a function is 
essential for constructing its graph because this domain is just 
a part of the axis of abscissas over or under which the graph is placed. 
We see three simple graphs depicted in Fig. 12; the domains of the 
functions are shaded. It is clear that in case a domain of definition 
consists of several separate parts the corresponding graph also 
consists of several components. 

[Fig. 12] 

16. Characteristics of Behaviour of Functions. Now our aim is to 
study the ways of describing characteristic features of the behaviour 
of functions. 

Unless otherwise stated, we shall regard functions in question 
as single-valued, that is we shall suppose that to each value of an 
independent variable taken from the domain of definition of a 
function there corresponds only one certain value of the function. 
Multiple-valued functions will be discussed in Sec. 20. 

A function is called increasing (decreasing) if the values of the 
function increase (decrease) as the argument increases. Both 
increasing and decreasing functions are called monotonic. When a 
function is not monotonic it is usually possible to indicate the intervals 
of monotonicity of the function on the axis of the argument; the 
function is monotonic on each of such intervals. Between the intervals 
of monotonicity there are often the intervals of constancy of the 
function. For example, in Fig. 13 we see the graphs of an increasing 
function f (x), of a decreasing function φ (x) and of a non-monotonic 
function ψ (x); the function ψ (x) has the interval of increase 
−∞ < x ≤ a, the interval of decrease a ≤ x ≤ b, the interval of 
constancy b ≤ x ≤ c and the interval of increase c ≤ x < ∞. 

The condition that a function f (x) increases can be written in 
the following way: $x_1 < x_2$ always implies $f(x_1) < f(x_2)$. This 
enables us to perform similar operations on both sides of inequalities 
involving the argument and the values of an increasing function: 
for example, as we know that $y = x^3$ is an increasing function we 
see that an inequality of the form a < b implies $a^3 < b^3$ and vice 
versa. In case a function f (x) is not monotonic the operation of this 
type can be performed on every interval of monotonicity in which 
the function increases. If a function decreases on some interval then 
$x_1 < x_2$ implies $f(x_1) > f(x_2)$. For instance, the function $y = x^2$ 
decreases over the interval −∞ < x ≤ 0 and increases for 
0 ≤ x < ∞; hence, a < b implies $a^2 > b^2$ in case b ≤ 0 and $a^2 < b^2$ 
if a ≥ 0. 
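
The defining inequalities can be tested on a finite set of sample points. Such a test can only refute monotonicity, never prove it, so the Python sketch below is a plausibility check under that assumption and not a proof. The helper names are introduced here for illustration.

```python
def is_increasing_on(func, points):
    """Check f(x1) < f(x2) for every pair of neighbouring sample points."""
    pts = sorted(points)
    return all(func(x1) < func(x2) for x1, x2 in zip(pts, pts[1:]))

def is_decreasing_on(func, points):
    """Check f(x1) > f(x2) for every pair of neighbouring sample points."""
    pts = sorted(points)
    return all(func(x1) > func(x2) for x1, x2 in zip(pts, pts[1:]))

def cube(x):
    return x**3

def square(x):
    return x**2

negative = [-3, -2, -1, 0]   # sample points with x <= 0
positive = [0, 1, 2, 3]      # sample points with x >= 0
whole = negative + positive[1:]
```

On these samples the cube increases everywhere, whereas the square decreases for x ≤ 0 and increases for x ≥ 0, so it is monotonic on neither side taken together.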

A function is called continuous if a continuous (“gradual”) change 
of the argument results in a continuous change of the values of the 
function (without “jumps”). Otherwise the function is called 
discontinuous and every value of the argument for which the continuity 
(“gradualness”) of the change of the function does not take place 
is called the point of discontinuity of the function. (These notions 
will be discussed in detail in Sec. III.4.) For example (see Fig. 14), 
the function $y = x^2$ is continuous over the whole x-axis; the function 
$y = \frac{1}{x}$ has one point of discontinuity x = 0 (the values of the 
function “approach infinity” as the values of the argument approach the 
value x = 0) and is continuous for all other values x ≠ 0; the 
function y = tan x has an infinitude of points of discontinuity 
$x = \pm\frac{\pi}{2},\ \pm\frac{3\pi}{2},\ \pm\frac{5\pi}{2},\ \ldots$. 

[Fig. 14] 

If a function is defined on both sides of its point of discontinuity 
the graph of the function is also discontinuous and consists of two 
or more pieces (components; see, for example, Fig. 14). 

[Fig. 15. Graph of y = sin x] 

The function y = sin x is an example of a periodic function. 
Namely (see Fig. 15), the behaviour of the function on the intervals 

$$\ldots,\ -4\pi \le x \le -2\pi,\quad -2\pi \le x \le 0,\quad 0 \le x \le 2\pi,\quad 2\pi \le x \le 4\pi,\ \ldots$$

is similar. More precisely, $\sin(x + 2\pi) \equiv \sin x$. Here the sign ≡ 
is the sign of identity. We write it when we want to underline that 
an equality is an identity (it is also permissible to put down the 
usual sign of equality in such a case). 
The number 2π is called the period of the function y = sin x. 
We also have the identities 
$\sin(x + 4\pi) \equiv \sin x$, $\sin(x + 6\pi) \equiv \sin x$, 
$\sin[x + (-2\pi)] \equiv \sin x$ etc. 
But the period is usually understood as a positive number and even 
the least number for which the identities hold. Therefore it is 2π 
that is the period of y = sin x but not 4π, 6π etc. By the way, in 
order to indicate the fact we sometimes speak about the primitive 
period. 

Generally, a function y = f (x) is called a periodic function with 
period A > 0 if there is an identity of the form $f(x + A) \equiv f(x)$. 
Such a function behaves in the same way on each of the intervals 

$$\ldots,\ a - 2A \le x \le a - A,\quad a - A \le x \le a,\quad a \le x \le a + A,\quad a + A \le x \le a + 2A,\ \ldots$$

where a is an arbitrary number. Therefore in order to investigate 
the function it is sufficient to consider its behaviour on one of the 
intervals (see Fig. 16). The equality f (x + A) = f (x) is illustrated 
in Fig. 16 for one of the values of x. 

[Fig. 16] 
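
The identity f (x + A) = f (x) can be tested numerically on a set of sample points. The Python sketch below is only a plausibility check on finitely many points, with a tolerance for rounding errors. The name has_period and the choice of samples are assumptions made here.

```python
import math

def has_period(func, period, samples, tol=1e-9):
    """Check f(x + period) = f(x) at every sample point, within tol."""
    return all(abs(func(x + period) - func(x)) <= tol for x in samples)

samples = [0.1 * k for k in range(-50, 51)]   # points from -5 to 5

two_pi_ok = has_period(math.sin, 2 * math.pi, samples)
four_pi_ok = has_period(math.sin, 4 * math.pi, samples)
pi_ok = has_period(math.sin, math.pi, samples)
```

Both 2π and 4π pass the test while π fails, which is consistent with 2π being the least positive period of the sine.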

A function y = f (x) is called an even function if it does not change 
its value when the sign of the argument is changed, that is if 
f (−x) = f (x). The examples of even functions are $y = x^2$, $y = x^6$, 
y = cos x etc. Fig. 17 shows that the graph of an even function is 
symmetric with respect to the axis of ordinates. A function f (x) 
is called odd in case it is multiplied by −1 when the sign of the 
argument is changed, that is f (−x) = −f (x). The examples of 
odd functions are y = x, $y = x^3$, y = sin x etc. Fig. 18 illustrates 
the fact that the graph of an odd function is symmetric with respect 
to the origin of the coordinate system. It should be noted that in 
the general case a function may be neither even nor odd; for example, 
this is the case with the functions y = 1 + sin x, y = 1 − x, $y = 2^x$, 
y = log x etc. 
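
The two defining identities can be compared on sample points in order to classify a function. The Python sketch below is illustrative only: the name parity and the sample points are assumptions made here, and the check applies only to functions defined at both x and −x, which excludes y = log x.

```python
import math

def parity(func, samples, tol=1e-9):
    """Return 'even', 'odd' or 'neither', judged on the given sample points."""
    even = all(abs(func(-x) - func(x)) <= tol for x in samples)
    odd = all(abs(func(-x) + func(x)) <= tol for x in samples)
    if even:
        return "even"   # a function vanishing on all samples is reported as even
    if odd:
        return "odd"
    return "neither"

samples = [0.5, 1.0, 2.0, 3.0]

results = {
    "cos x": parity(math.cos, samples),
    "sin x": parity(math.sin, samples),
    "x^2": parity(lambda x: x**2, samples),
    "1 + sin x": parity(lambda x: 1 + math.sin(x), samples),
}
```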

[Fig. 17] [Fig. 18] 

17. Algebraic Classification of Functions. Functions represented 
by a single formula (see Sec. 13) are classified depending on the 
necessary algebraic operations which should be performed on the 
values of the argument in order to obtain the values of the functions. 
If only the operations of addition, subtraction and multiplication 
are used, and also the operation of raising to a positive integral 
power which is a special case of multiplication, the function is 
called a polynomial or an entire (integral) rational function; in 
forming a polynomial arbitrary constant coefficients can be used. 
Examples of polynomials are y = 2° — I2+3, y= so", y=3, 


y= —=+ V 223, y = atz? — 2 and the like. On the other hand, 


the functions y = z and y = 2° + 2 Vz are not polynomials 
in the sense of the above definition. Every polynomial is characte- 
rized by its degree which is the highest of the exponents of powers 
of the independent variable entering into the expression of a poly- 
nomial; for instance, the degrees of the above written polynomials 
are, respectively, 3, 2, 0, 3, and 2. 

Rational functions form a wider class of functions: these are the 
functions which involve the additional operation of division. If 
a rational function is not an entire function it is called a fractional 
rational function. An example of such a function is 

According to the rules of elementary algebra every rational function 
can be represented as a ratio of two polynomials after all the summands 
entering into the expression of the function are reduced to a common 
denominator. 

There is a still wider class of functions whose analytical expres- 
sions may involve an additional operation of extracting roots. This 
is the class of algebraic functions. If an algebraic function is not 
rational it is called irrational. An example of an irrational function 


is y= 44 Vri. 



Functions which are not algebraic are called transcendental. 
Examples of transcendental functions are y = sin x, $y = x^2 + \tan x$, 
$y = 2^x$, y = log x etc. We point out that the last two 
functions are transcendental despite the fact that they are sometimes 
traditionally considered in elementary courses on algebra. 

All these definitions are automatically extended to functions of 
several independent variables. The only new fact is the definition 
of the degree of a polynomial in several variables: it is defined as 
the greatest of the sums of the exponents of arguments entering 
into the monomials which are the summands in the expression of 
the polynomial. 

For instance, the function $f(x, y) = x^4y - x^4y^2 + x$ is a polynomial 
of the sixth degree in x and y. But if we regard y as fixed 
the same function will be a polynomial of the fourth degree in x. 
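
The rule for the degree of a polynomial in several variables is easy to state as a computation over its monomials. In the Python sketch below a polynomial in x and y is stored as a list of triples (coefficient, exponent of x, exponent of y). This representation, the function names and the polynomial x^4 y − x^4 y^2 + x chosen as input (our reading of the example) are assumptions made for illustration.

```python
# Each monomial is (coefficient, exponent of x, exponent of y).
# Assumed example: x^4*y - x^4*y^2 + x.
poly = [(1, 4, 1), (-1, 4, 2), (1, 1, 0)]

def total_degree(terms):
    """Greatest sum of exponents over the monomials with non-zero coefficient."""
    return max(ex + ey for c, ex, ey in terms if c != 0)

def degree_in_x(terms):
    """Degree when y is regarded as fixed: the greatest exponent of x."""
    return max(ex for c, ex, ey in terms if c != 0)

deg_xy = total_degree(poly)
deg_x = degree_in_x(poly)
```

For this polynomial the monomial x^4 y^2 gives the total degree 4 + 2 = 6, while the greatest exponent of x alone is 4.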

A polynomial of the first degree and a polynomial of the second 
degree are called, respectively, a linear function and a quadratic 
function. A polynomial of the third degree is called a cubic function 
and so on. These terms are applied for any number of independent 
variables. 

18. Elementary Functions. We first enumerate the basic elemen- 
tary functions studied in elementary mathematical courses: 

$y = x^a$ (where a is constant) is a power function; 

$y = a^x$ (where a is constant) is an exponential function; 

$y = \log_a x$ (where a is constant) is a logarithmic function; 

y = sin x, y = cos x, y = tan x and y = cot x are trigonometric 
functions (circular functions); 

y = arc sin x, y = arc cos x etc. are inverse trigonometric 
functions (inverse circular functions). 

Elementary functions are all the functions which can be obtained 
from basic elementary functions by means of algebraic operations 
(with any numerical coefficients) and the operation of composing 
a function of a function (see Sec. 12). In view of this definition all 
the algebraic functions are elementary. But very many transcendental 
functions are also elementary, for example, the functions y = z + 
-+ log sin z, y = 2log tan x+sin® etc, (one may come across very 
complicated expressions of this type). 

The class of elementary functions includes the greater part of 
functions treated in general courses of higher mathematics. As an 
example of a function which is not elementary we can mention 
y = x! (but we do not give now the general definition of the function 
involving non-integral values of the argument; on this question 
see Sec. XIV.17). Many non-elementary functions are widely used 
in special branches of mathematics and its applications. Many of 
these functions have been investigated in detail and therefore the 
traditional classification into elementary and non-elementary 
functions may now be considered out of date. 

19. Transforming Graphs. It often happens that we know the 
graph of a function and it is necessary to construct the graphs of 
some other functions which can be expressed in a certain way in 
terms of the former graph. Here we give several examples of trans- 
forming graphs in this manner. 

Let the graph of a function y = f (x) be given. It is required to 
construct the graphs of the functions z = f (x) + a and u = f (x + b) 
where a and b are some constants. The values of the quantities z 
and u will be represented on the same axis of ordinates as that 
for y (see Fig. 19). Then we have z = y + a for any x and therefore 
the graph of the function z (x) can be obtained from the graph of 
the function y (x) by translating the latter along the y-axis by the 
distance a in the positive direction of the axis in case a > 0. Such 
a translation is depicted in Fig. 19 where each of the vertical line 
segments has the length a. As for the graph of the function u (x), 
one may think that it is obtained from the graph of y (x) by 
translating the latter along the x-axis by the distance b in the positive 
direction in case b > 0. But this conclusion is wrong; in fact we 
should displace the graph of y (x) by the amount b in the negative 
direction (if b > 0) in order to obtain the graph of u (x). Indeed, 
the value u = y is obtained if we take the value of the argument for 
u which is smaller by b than the corresponding value of the argument 
for y since u = f [(x − b) + b] = f (x) = y. Of course, if a < 0 
or b < 0 the corresponding translation should be carried out in 
the opposite direction. On the other hand, when we say, for example, 
“to displace upwards by (−3)” we mean, in fact, “to displace 
downwards by (+3)” etc. Therefore it is permissible to say that we 
translate a graph in a certain direction by an amount h no matter what 
the sign of h is. 

[Fig. 19] [Fig. 20] 

The graphs of the functions v = kf (x) and w = f (kx) are 
constructed in a similar way (see Fig. 20). The graph of the function v (x) 
is obtained from the graph of y (x) by the uniform k-fold expansion 
of the latter in the direction of the y-axis so that all the distances 
from the points of the graph of y (x) to the x-axis should increase 
k times (in case k > 1). Indeed, the points of the graph of v (x) which 
have the same abscissas as the points of the graph of y (x) have the 
ordinates k times the ordinates of y (x). The graph of the function 
w (x) is obtained from the graph of the function y (x) by the uniform 
contraction of the latter toward the y-axis with the k-fold decrease 
(in case k > 1) of all the distances of the points of the graph of 
y (x) to the y-axis. Actually, we have 
$w\left(\frac{x}{k}\right) = f\left(k \cdot \frac{x}{k}\right) = f(x) = y(x)$. 
Of course, what we have said is literally true if k > 1. 
In case 0 < k < 1 the expansion is replaced by the contraction and 
vice versa. But again, when we say, for example, “the (1/3)-fold 
expansion” we actually mean “the 3-fold contraction” and the like. 
Therefore we can say that we perform a k-fold expansion (or 
contraction) without specifying the magnitude of k. In conclusion we 
remark that if k < 0 we should additionally apply the operation of 
forming the corresponding mirror images of the graphs of v (x) 
and w (x) (for the function v (x) the mirror image must be taken 
about the x-axis and for the function w (x) the mirror image must 
be taken with respect to the y-axis). 

Now combining the above results we can say that the graph of
the function y = kf (mx + b) + a can be obtained from the graph
of the function y = f (x) by means of the following transformations
(performed in succession): the parallel translation along the x-axis
[which yields the graph of y = f (x + b)], the contraction [which
results in the graph of y = f (mx + b)], the expansion [which gives
the graph of y = kf (mx + b)] and one more final translation along
the y-axis resulting in the desired graph of y = kf (mx + b) + a.
(If necessary, the corresponding mirror images should also be taken.)
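
The succession of transformations described above can be checked
numerically. The following Python sketch is only an illustration: the
helper name transform_point, the sample function f and the parameter
values are assumptions made for the example (with m different from
zero), not part of the course.

```python
def transform_point(x0, y0, k, m, b, a):
    """Carry a point (x0, y0) of the graph y = f(x) over to the graph
    y = k*f(m*x + b) + a by the four successive transformations."""
    x1 = x0 - b      # translation along the x-axis: graph of y = f(x + b)
    x2 = x1 / m      # contraction toward the y-axis: graph of y = f(m*x + b)
    y1 = k * y0      # expansion from the x-axis: graph of y = k*f(m*x + b)
    y2 = y1 + a      # translation along the y-axis
    return x2, y2


def f(x):
    # arbitrary sample function (hypothetical)
    return x ** 3 - 2.0 * x


k, m, b, a = 2.0, 3.0, 1.5, -0.5
for x0 in (-1.0, 0.0, 0.7, 2.0):
    x, y = transform_point(x0, f(x0), k, m, b, a)
    # the transformed point must satisfy y = k*f(m*x + b) + a
    assert abs(y - (k * f(m * x + b) + a)) < 1e-9
```

Note that a positive b moves every point to the left (x1 = x0 − b),
in agreement with the remark on the direction of the translation.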

The same results can be obtained by the corresponding operations 
on the coordinate axes without changing the graph. For example, 
instead of displacing the graph to the right we can translate the 
axes to the left, or, in other words, displace the origin (from which 
x is reckoned) to the left. Similarly, instead of expanding the graph 
and increasing the distances from the x-axis k times we can decrease
the corresponding unit of length for the y-axis k times. 

We can perform arithmetical operations on functions represented 
graphically. For example, Fig. 21 illustrates the graphical addition 
of two functions: the graphs of f (x) and φ (x) are given and it is
necessary to construct the graph of their sum. The equal line
segments lying one above another are shown in heavy lines.

In conclusion we demonstrate (see Fig. 22) the graphical construction
of the composite function z = φ [f (x)] when the graphs of each
of the functions z = φ (y) and y = f (x) are given. It is convenient
to place the given graphs in the manner shown in the figure. Then
if we take a certain value x and draw the line segment AB equal
to the line segment A'B' the point B will lie on the graph of the
composite function; therefore the point B describes the sought-for
graph when x runs over the corresponding range.


Fig. 21


Fig. 22 (the graphs of z = φ (y) and y = f (x) are given; the graph
of z = φ [f (x)] is to be constructed)


20. Implicit Functions. An implicit function is a function which
is defined by an unsolved equation connecting the argument and
the function. Solving this equation for the function we obtain the
same function but represented explicitly. For instance, the equalities
x − y³ + 2 = 0 and y = ∛(x + 2) are equivalent; they define the
same function y = y (x) but the former relation represents the
function implicitly, as an implicit function, whereas the latter
represents the function explicitly. But it often happens that it is
practically impossible to solve an equation for a function and it
is sometimes inexpedient though possible. In such cases we retain
the equation in its original unsolved form. In the general form
such an equation can be written as


$F(x, y) = 0$ (4)


(after the right-hand terms have been transposed to the left-hand
side). One must not regard such an equation as inconvenient or
difficult to deal with. Later on we shall present some methods which
make it possible to investigate functions represented implicitly.

If a value of x is given then in order to determine the corresponding
value y of an implicit function y (x) defined by an equation
of form (4) we must solve the equation. As is well known, the
substitution of a solution of an equation into the equation turns the
equation into an identity. Therefore we can say that an implicit
function y = y (x) defined by an equation of the form (4) is a
function which turns the equation into an identity when it is
substituted into equation (4) (let the reader verify that this
definition holds for the above example).
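
When an equation of form (4) cannot be solved for y in closed form,
the value of the implicit function for a given x can still be found
numerically. The sketch below is an illustration under stated
assumptions: the sample equation y + e^y − x = 0 and the names F and
implicit_value are hypothetical, and the bisection relies on this
particular F being increasing in y.

```python
import math


def F(x, y):
    # hypothetical sample equation F(x, y) = 0 with no elementary solution for y
    return y + math.exp(y) - x


def implicit_value(x, lo=-50.0, hi=50.0, tol=1e-12):
    """Solve F(x, y) = 0 for y by bisection on [lo, hi].

    Assumes F(x, lo) < 0 < F(x, hi) and that F increases with y."""
    assert F(x, lo) < 0.0 < F(x, hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(x, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


y = implicit_value(3.0)
# substituting the value found turns the equation into an (approximate) identity
assert abs(F(3.0, y)) < 1e-9
```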

Equation (4) may have more than one solution for a given x. Then
the function y (x) is multiple-valued, that is the function can take
on more than one value for the given value of the argument. For
example, taking the implicit function determined by the equation


$x - y^2 = 0$ (5)


we obtain, for any given value x > 0, two values of y: y = √x
and y = −√x; the value of the radical itself is usually regarded
as positive, in the arithmetical sense. It is difficult to investigate
a multiple-valued function and therefore one usually tries to avoid
investigating such a function in a direct way. In such a case it is
convenient to separate the function into single-valued branches
corresponding to some chosen values of the function. For instance,
in our previous example the two-valued function y = ±√x defined
by equation (5) has two single-valued branches, namely, (y)₁ = √x
and (y)₂ = −√x.

Each branch of a multiple-valued function is a single-valued
function and therefore its graph is an ordinary one. All such branches
usually form an entire curve (exceptions to this rule will be discussed
in Sec. II.8) which should be regarded as the graph of the function
defined by equation (4). For instance, in our example, equation (5)
can be rewritten in the form x = y² and this implies that the graph
is an ordinary parabola. The only distinction from the “standard”
equation of a parabola of the form y = x² considered in elementary
mathematical courses lies in the fact that here the “standard” roles
of the axes x and y are interchanged (see Fig. 23). Each of the
single-valued branches is represented by the corresponding half of the
parabola: the first branch is represented by the upper half and the
second one by the lower half.

The graph of an implicit function may have the form shown in
Fig. 24. Here we see that the function is single-valued for x < a
and for x > b whereas it is three-valued for a < x < b. When
separating the function into branches it is natural to regard the arc AB
as the graph of the first branch, the arc BC as the graph of the second
branch and the arc CD as that of the third one.


Fig. 23 Fig. 24 


In connection with the question of multiple-valued functions 
discussed above we can note that for some functions to every value 
of the independent variable there corresponds an entire interval 
of values of the function. For instance, the relation between the 
height of a person and his possible weight is an example of such 
a functional relation. Functions of this kind are usually investigated 
in the theory of probabilities (see Sec. XVIII.16) and they will not 
occur in other chapters of our course. 

21. Inverse Functions. Suppose we are given a function


$y = f(x)$ (6)


Let us now take different values of y and find the corresponding
values of x, that is let us choose the former dependent variable as
an argument and regard the former independent variable as a
function. The function (functional relation) x (y) thus obtained is
called the inverse function of the original function y (x). It is
represented by the same equality (6) in which y is now regarded as an
independent variable whereas x is regarded as a dependent variable.
But we have already pointed out that it is permissible to use different
letters for denoting variables in considering one and the same
function. Therefore, if we wanted to denote, as we usually did, the
independent variable by x and the dependent variable by y for the
inverse function we should simply have to substitute x for y and
y for x in (6). Hence, using the new notation we rewrite the relation
which defines the inverse function in the form


$x = f(y)$ (7)


Thus, the inverse function turns out to be represented in an implicit
form and therefore (see Sec. 20) it is, generally speaking,
multiple-valued. We can easily establish a condition which guarantees
the single-valuedness of an inverse function: this is the monotonicity
of the original function. Indeed, this being so, we obtain a certain
uniquely defined value x = x (y) for each given value of y (see
Fig. 25).


Fig. 25 Fig. 26


Examples. The inverse function of the function y = x³ is defined
by the equality x = y³, that is y = ∛x; the inverse function of
y = x² is the two-valued function y = ±√x.

Equalities (6) and (7) differ only in the interchange of the notation
of the quantities x and y, that is in the interchange of their roles.
Therefore we see, as it is shown in Fig. 26, that the graph of the
inverse function is obtained as a mirror image of the graph of the
original function about the bisector of the angle between the
coordinate axes (the bisector is represented by the dotted line in
Fig. 26). The points M and M' in Fig. 26 both correspond to one and
the same equality of the form b = f (a).
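
The mirror-image property is easy to verify on a computer: reflecting
a point about the bisector simply interchanges its coordinates. The
following Python sketch is an illustration only; the sample function
y = x³, the chosen abscissas and the helper names are assumptions made
for the example.

```python
import math


def f(x):
    # monotone sample function, so its inverse is single-valued
    return x ** 3


def f_inverse(y):
    # real cube root, valid for negative y as well
    return math.copysign(abs(y) ** (1.0 / 3.0), y)


# points (a, b) of the graph of f, i.e. b = f(a)
points = [(x, f(x)) for x in (-2.0, -0.5, 0.0, 1.0, 1.5)]

# mirror images about the bisector y = x: coordinates are interchanged
mirrored = [(b, a) for (a, b) in points]

# every mirrored point lies on the graph of the inverse function
for x, y in mirrored:
    assert abs(f_inverse(x) - y) < 1e-12
```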

We remark in conclusion that if the function x (y) is the inverse
function of the function y (x) then, conversely, the latter is the
inverse function of the former.

§ 4. Review of Basic Functions


Many of the functions which we are going to discuss here are
studied in elementary mathematical courses. We shall consider them
here because of their significance.

22. Linear Function. The general form of a linear function (see
the end of Sec. 17) is


$y = ax + b$ (8)


where a and b are constant coefficients.

The graph of a linear function is a straight line (see Fig. 27). The
coefficient a is called the slope of the straight line. The greater |a|
(i.e. the greater a in its absolute value), the steeper the slope of the
straight line (with regard to the x-axis). If the argument of a
function changes from a value x₀ to a certain value x it receives an
increment Δx (which is equal to x − x₀)*. Then the function
receives the corresponding increment Δy [which is equal to
y − y₀ = f (x) − f (x₀)]. In our case y = f (x) = ax + b and therefore
y₀ = ax₀ + b and y = ax + b. Consequently, y − y₀ = a (x − x₀),
i.e. Δy = a Δx. This implies


$\frac{\Delta y}{\Delta x} = a \quad (\text{if } \Delta x \neq 0)$ (9)


Thus, the ratio of the increment of a linear function to the
increment of the argument is constant and equal to the slope of the
graph. The increment of a linear function is directly proportional to
the increment of the argument.


Fig. 27


In Fig. 27 the case when a > 0 is shown. If a < 0 the straight
line is drawn downwards to the right (see Fig. 28). In case a = 0
the straight line is parallel to the x-axis; in this case the function
is constant and thus we obtain the graph of a constant.

The property of the increment of a linear function forms the basis
for the so-called linear interpolation which is used even in
elementary mathematical courses. The idea of this method is the
following. Suppose we know the values of a function y = f (x) (its
graph is depicted in Fig. 29 by the dotted line) for x = x₀ and for
x = x₀ + h:


$f(x_0) = y_0, \quad f(x_0 + h) = y_1$


but the intermediate values of the function for the values of x lying
between x = x₀ and x = x₀ + h are unknown. Then we approximately
replace the given function by a linear function which assumes
the same values for x = x₀ and for x = x₀ + h, that is we
replace the arc AE by the straight line segment AE. The similarity
of the triangles ABC and ADE then implies


$\frac{y - y_0}{x - x_0} = \frac{y_1 - y_0}{h}, \quad \text{i.e.} \quad y = y_0 + \frac{y_1 - y_0}{h}\,(x - x_0)$


Such a replacement is possible in case the function f (x) slightly
differs from the linear function on the interval between x₀ and
x₀ + h. The interpolation method is widely used, in particular,
for tables with a sufficiently small step when the successive values
of the function differ slightly from each other. More precise methods
of interpolation will be discussed in Secs. V.6-8. The linear
extrapolation (see Sec. 13) is performed in like manner.


Fig. 28 Fig. 29


* The Greek letter Δ (delta) is used to denote an increment. The
symbol Δx should be regarded as an indivisible symbol and by no means
as the product of Δ by x. An “increment” is understood in the
algebraic sense, i.e. it can be positive, negative or equal to zero.
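
The interpolation formula can be tried out on a function whose
intermediate values are in fact known, so that the error of the
replacement can be seen. The Python sketch below is an illustration
only: the function name linear_interpolation, the choice of sin x as
the tabulated function and the step h = 0.1 are assumptions made for
the example.

```python
import math


def linear_interpolation(x0, y0, h, y1, x):
    """Value at x of the straight line through (x0, y0) and (x0 + h, y1)."""
    return y0 + (y1 - y0) / h * (x - x0)


# two neighbouring "table" entries of the sine (hypothetical table step 0.1)
x0, h = 0.50, 0.10
y0, y1 = math.sin(x0), math.sin(x0 + h)

x = 0.55                                    # intermediate argument
approx = linear_interpolation(x0, y0, h, y1, x)
error = abs(approx - math.sin(x))           # small, since sin x is nearly linear here
```

For this step the error is below one unit of the third decimal place;
halving the step of the table reduces it roughly four times.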

Formula (9) and Fig. 27 imply that a = tan φ, i.e. the slope of
a straight line is equal to the tangent of the angle of inclination of the
line to the axis of abscissas.

If the quantities x and y have certain dimensions the slope also
has a dimension. Formula (8) shows that [b] = [y] and [ax] = [y]
which implies [a] = [y]/[x] (the dimensions of coefficients entering
into other formulas can be determined similarly). The geometrical
meaning of the slope can be easily interpreted in the general case:
if l_x units of length for the x-axis correspond to the unit measure
of the quantity x and l_y units of length for the y-axis correspond to
the unit measure of the quantity y (l_x and l_y are the so-called scale
factors) then the sides AB and BC of the triangle ABC in Fig. 27
are of lengths l_x Δx and l_y Δy, respectively. Consequently,


$\tan\varphi = \frac{l_y\,\Delta y}{l_x\,\Delta x} = \frac{l_y}{l_x}\,a \quad \text{and} \quad a = \frac{l_x}{l_y}\tan\varphi$ (10)


i.e. the slope is proportional to the tangent of the angle φ.
23. Quadratic Function. The general form of a quadratic function is


$y = ax^2 + bx + c$


From elementary mathematical courses it is known that the graph
of a quadratic function is a parabola. In the simplest case when a = 1,
b = 0 and c = 0, i.e. y = x², the graph has the form depicted in
Fig. 30. Then the function is even and the y-axis is the symmetry
axis of the graph (the axis of the parabola). The intersection point
of a parabola with its axis is called the vertex of the parabola. The
vertex in Fig. 30 is placed at the origin of the coordinate system.


Fig. 30 Fig. 31


In the general case when a ≠ 0, b and c are arbitrary numbers
the parabola is obtained from the parabola depicted in Fig. 30 by
the operations of uniform expansion and parallel translation. To
determine the position of the vertex we can apply the so-called
method of completing a square which we shall demonstrate here by
considering a concrete numerical example. Let the quadratic
function y = 2x² − 3x + 1 be given*. Then we perform the following
simple transformations:


$y = 2\left(x^2 - \frac{3}{2}x + \frac{1}{2}\right) = 2\left[\left(x - \frac{3}{4}\right)^2 - \left(\frac{3}{4}\right)^2 + \frac{1}{2}\right] = 2\left(x - \frac{3}{4}\right)^2 - \frac{1}{8}$


Consequently (see Sec. 19) we obtain the sought-for graph from
the parabola depicted in Fig. 30 by translating the parabola 3/4 unit
of length to the right, expanding it along the direction of the y-axis
with the two-fold increase of the distances from the points of the
graph to the x-axis and, finally, by translating 1/8 unit downwards.
The graph thus obtained is shown in Fig. 31. To construct a more
accurate graph we can additionally take several values of x and
determine the corresponding values of y which enables us to construct
the corresponding points of the graph. For instance, we have y = 1 for
x = 0, y = 0 for x = 1 and y = 3 for x = 2; the corresponding
points are indicated on the graph. The vertex of the constructed
parabola is situated at the point M with the coordinates x = 3/4
and y = −1/8. The parabola is “narrower” than the one depicted in
Fig. 30 (with the same unit of length).


Fig. 32 (a < 0) Fig. 33


Generally, the greater |a|, the narrower the parabola. If a < 0
the branches of the parabola are open downwards (see Fig. 32). In
case a = 0 the quadratic function turns into a linear one.


* In practice we usually have quadratic functions (trinomials) whose
coefficients are approximate numbers (in contrast to the above
trinomial with exact coefficients). For instance, we can take the
trinomial y = 2.17x² − 3.21x + 0.84 and the like. But if we
investigate the case with exact coefficients we can easily pass to
a more complicated case. The comment also refers to further examples
of this type.
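
The method of completing a square is readily expressed in general
form: y = ax² + bx + c = a (x − p)² + q with p = −b/(2a) and
q = c − b²/(4a). The Python sketch below is an illustration only; the
function name complete_square is an assumption, and exact rational
arithmetic is used so that the result can be compared with the
numerical example above.

```python
from fractions import Fraction


def complete_square(a, b, c):
    """Return (p, q) such that a*x**2 + b*x + c == a*(x - p)**2 + q.

    The point (p, q) is the vertex of the parabola; a must be nonzero."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a == 0:
        raise ValueError("a = 0: the function is linear, not quadratic")
    p = -b / (2 * a)
    q = c - b * b / (4 * a)
    return p, q


# the trinomial y = 2x^2 - 3x + 1 considered in the text
p, q = complete_square(2, -3, 1)

# the two forms agree at the sample points used for the graph
for x in (0, 1, 2):
    assert 2 * x * x - 3 * x + 1 == 2 * (x - p) ** 2 + q
```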

24. Power Function. The general form of a power function is


$y = x^n$


If 0 < x < 1 then the greater n, the smaller the values of the
function. But if x > 1 then the greater n, the greater the values of the
function. In Fig. 33 we see the graphs for n = 1, 2, 3 and 4. While
constructing the parts of the graphs of y = xⁿ for the values x < 0
one should take into account that the function y = xⁿ is even for
even n and odd for odd n. In particular, let us consider in detail
the graph of the function y = x³ (the cubic parabola). The graph
is convex upwards (or concave downwards) for x < 0, that is it lies
under the tangent drawn at any of its points. For x > 0 the graph
is convex downwards (or concave upwards). If we pass from left to
right through the origin of the coordinate system the direction of
convexity changes to the opposite one. The tangent to the graph
at the origin coincides with the x-axis but at the point of tangency
O the curve passes from one side of the tangent line to another. Such
points are called the points of inflection of a curve. Thus, the cubic
parabola has one point of inflection. Among the well-known curves,
the sinusoid, for example, has points of inflection. For fractional
values of the exponent n the graphs are placed between the
corresponding graphs for integral values of n. But in the case of
a fractional n one should be careful when constructing graphs:
a negative number raised to a fractional power may result in an
imaginary number and therefore in such a case we must not construct
the graph for x < 0.


Fig. 34


Let us consider the case 0 < n < 1. For instance, let n = 1/2,
that is y = x^(1/2) = √x. Then, as it was shown in Sec. 20, the graph
is the upper half of the ordinary (quadratic) parabola with the
symmetry axis coinciding with the x-axis (see Fig. 34). The graphs of
power functions for some other fractional n are also shown in Fig. 34.
In case the fraction representing n has an odd denominator the graph
exists not only for x > 0 but for x < 0 as well because, for negative
numbers, we can extract real roots with odd indices of radicals. In
particular, let us take the graph of the function y = x^(2/3) (the
semicubical parabola) depicted in Fig. 35. The graph first approaches
the origin of the coordinates (for example, when we pass from left
to right) and then departs from the origin. At the origin this curve
has the so-called spinode, or cusp. Later on we shall investigate
some other curves having cusps.
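
The caution about fractional exponents is also relevant in
computation: in Python, for instance, raising a negative float to
a fractional power yields a complex number. The sketch below shows one
way of obtaining the real value for an odd denominator; the function
name real_power and its interface are assumptions made for the
example.

```python
import math


def real_power(x, p, q):
    """Real value of x**(p/q) for positive integers p and q.

    For x < 0 a real value exists only if the reduced denominator is odd."""
    g = math.gcd(p, q)
    p, q = p // g, q // g            # reduce the fraction p/q
    if x >= 0:
        return x ** (p / q)
    if q % 2 == 0:
        raise ValueError("even root of a negative number is not real")
    root = -((-x) ** (1.0 / q))      # real odd root of a negative number
    return root ** p


# points of the semicubical parabola y = x^(2/3) on both sides of the cusp
values = [real_power(x, 2, 3) for x in (-8.0, -1.0, 0.0, 1.0, 8.0)]
```

The values obtained are symmetric about the y-axis, as they should be
for an even numerator of the exponent.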

Finally, let us take the case of a negative n (n = −m < 0).
Then y = 1/x^m and therefore we have very large values
of |y| for very small |x| and vice versa. The corresponding graphs
are shown in Fig. 36 for x > 0; we leave to the
reader the construction of the parts of the graphs corresponding to
x < 0. All these graphs stretch along the coordinate axes and
approach them unlimitedly as x or y approaches infinity. Generally,
when a curve and a straight line have a mutual disposition
of such a kind the straight line is called the asymptote of the curve.
Hence, each of the above graphs has two asymptotes which are the
coordinate axes.


Fig. 36 Fig. 37


One must not think, however, that a curve can never intersect its
asymptote. For example, investigating the so-called damped
oscillations we obtain a graph of the form shown in Fig. 37. Here
the x-axis is the asymptote of the graph which intersects it
infinitely many times.
25. Linear-Fractional Function. A linear-fractional function has
the general form


$y = \frac{ax + b}{cx + d}$ (11)


In the simplest case when a = d = 0 we obtain, denoting b/c = k,
the expression y = k/x which describes the inverse proportional
relation. The corresponding graph, as is well known from elementary
mathematical courses, is called a hyperbola. The graph is depicted in
Fig. 38 for the two cases k > 0 and k < 0 separately. The function
y = k/x being odd, the hyperbola has a centre of symmetry (which
is located at the origin of coordinates in Fig. 38). It has two
asymptotes (which are the coordinate axes in Fig. 38). We shall prove in
Sec. II.13 that a hyperbola has two symmetry axes (for the
hyperbola in question the axes of symmetry are the bisectors of the
angles between the coordinate axes in Fig. 38).


Fig. 38

In the general case the graph of a linear-fractional function is
also a hyperbola which can be obtained by a parallel translation
of the hyperbola depicted in Fig. 38. We shall illustrate this by
taking a concrete numerical example. Let y = (2x + 5)/(3x − 2). We now
carry out the following simple transformations:


$y = \frac{2x + 5}{3x - 2} = \frac{2\left(x + \frac{5}{2}\right)}{3\left(x - \frac{2}{3}\right)} = \frac{2}{3}\cdot\frac{x - \frac{2}{3} + \frac{19}{6}}{x - \frac{2}{3}} = \frac{2}{3}\left(1 + \frac{19/6}{x - \frac{2}{3}}\right) = \frac{19/9}{x - \frac{2}{3}} + \frac{2}{3}$


Thus we conclude that the desired graph can be obtained from
the graph of the function y = (19/9)/x by translating the latter
2/3 unit of length to the right and 2/3 unit upwards. Hence, we get
a hyperbola whose centre of symmetry is at the point x = 2/3,
y = 2/3 (see Fig. 39).
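
The reduction of a linear-fractional function to a translated
hyperbola can be written in general form: for c ≠ 0 we have
(ax + b)/(cx + d) = y₀ + k/(x − x₀) with x₀ = −d/c, y₀ = a/c and
k = (bc − ad)/c². The Python sketch below is an illustration only; the
function name hyperbola_parameters and the sample coefficients are
assumptions made for the example.

```python
from fractions import Fraction


def hyperbola_parameters(a, b, c, d):
    """Rewrite y = (a*x + b)/(c*x + d) as y = y0 + k/(x - x0).

    Returns (x0, y0, k); (x0, y0) is the centre of symmetry of the graph."""
    a, b, c, d = map(Fraction, (a, b, c, d))
    if c == 0:
        raise ValueError("c = 0: the function is linear, not linear-fractional")
    x0 = -d / c                       # the denominator vanishes here
    y0 = a / c
    k = (b * c - a * d) / c ** 2      # k = 0 would mean a constant function
    return x0, y0, k


# hypothetical sample: y = (x + 2)/(x - 1) = 1 + 3/(x - 1)
x0, y0, k = hyperbola_parameters(1, 2, 1, -1)

# both forms give the same value away from the point x = x0
x = Fraction(5, 2)
assert (x + 2) / (x - 1) == y0 + k / (x - x0)
```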
A linear-fractional function of general form (11) has a point of
discontinuity at x = −d/c because the denominator vanishes at
the point. This accounts for the fact that the graph consists of two
separate portions (see Sec. 16).

26. Logarithmic Function. A logarithmic function is a function
of the form


$y = \log_a x$ (12)


It is defined only for x > 0, and we consider the bases of logarithms
a > 0 (a ≠ 1). The graphs of logarithmic functions are shown for
different bases in Fig. 40. They have neither symmetry axes nor
centres of symmetry but have an asymptote which is the y-axis.
All the logarithmic functions are proportional to each other since
taking logarithms to the base b of both sides of the equality
a^(log_a x) = x we get


$\log_b x = \log_a x \cdot \log_b a = k \log_a x \quad \left(k = \log_b a = \frac{1}{\log_a b}\right)$ (13)


Therefore we can obtain all the graphs depicted in Fig. 40 by
expanding or contracting one of them along the direction of the
y-axis with a uniform increase or, respectively, with a uniform
decrease of the distances of the points of the graph from the x-axis.
Now let us consider the angles of intersection of the graphs with the
x-axis. According to the general definition of the angle between
intersecting curves as the angle between the tangents to the curves at
the point of their intersection, we mean here the angles formed by the
tangents to the graphs with the x-axis. When the graph is expanded or
contracted in the way described above the tangent rotates about
the point of intersection. We see that the tangent has a very gentle
inclination (to the x-axis) for very large values of a and a very
steep inclination for values of a close to 1. For a certain value of a
the angle of intersection of the graph of logarithmic function (12)
with the x-axis is equal to 45°. This value of a is denoted by the
letter e. It plays an important role in mathematics as we shall see
later.


Fig. 40


We see in Fig. 40 that the angle of intersection is greater than 45°
for a = 2 and smaller than 45° for a = 4; hence, the number e
lies between the limits 2 and 4. More accurate calculations which
will be described in Sec. IV.16 show that e = 2.71828 with an
accuracy of 10⁻⁵. The notation e for this number was introduced by
Euler.
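
The geometric description of e given above can be turned into a rough
computation. The Python sketch below is an illustration only and is
not the method of Sec. IV.16: it estimates the slope of the graph of
log_a x at the point x = 1 by a difference quotient (the step h is an
assumption) and then narrows the interval from 2 to 4 by repeated
halving until the slope equals tan 45° = 1.

```python
import math


def slope_at_one(a, h=1e-6):
    """Central difference quotient of log_a x at x = 1 (approximate slope)."""
    def log_a(x):
        return math.log(x) / math.log(a)
    return (log_a(1 + h) - log_a(1 - h)) / (2 * h)


lo, hi = 2.0, 4.0                 # slope > 1 for a = 2, slope < 1 for a = 4
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if slope_at_one(mid) > 1.0:   # still steeper than 45 degrees: base too small
        lo = mid
    else:
        hi = mid

e_estimate = 0.5 * (lo + hi)
```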

Logarithms to the base e are called natural logarithms [Napierian
logarithms after the Scottish mathematician J. Napier (1550-1617)].
They are denoted as ln x = log_e x. The graph of the natural
logarithm is shown in Fig. 41. A logarithm to any other base can be
expressed in terms of the natural logarithms in accordance with
formula (13):


$\log_a x = \frac{\ln x}{\ln a}$ (14)


Hence, the formulas for passing from the common (decimal)
logarithms to the natural ones and vice versa are


$\log x = 0.4343 \ln x \quad \text{and} \quad \ln x = 2.303 \log x$


where the values of the proportionality factors are accurate to four
significant digits.

Besides the natural logarithms we also use the common logarithms
(in numerical calculations) and the logarithms to the base 2 (in
information theory and some other branches of modern mathematics).


Fig. 41 Fig. 42
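
Formula (14) and the two conversion factors are easily checked
numerically. The following Python sketch is an illustration only; the
names log_base and M are assumptions made for the example.

```python
import math


def log_base(a, x):
    """Logarithm of x to the base a computed by formula (14)."""
    return math.log(x) / math.log(a)


M = 1 / math.log(10)      # factor in  log x = M ln x

value = log_base(2, 8)    # the logarithm of 8 to the base 2

# passing from natural to common logarithms and back
x = 7.5
assert abs(math.log10(x) - M * math.log(x)) < 1e-12
assert abs(math.log(x) - math.log10(x) / M) < 1e-12
```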


27. Exponential Function. An exponential function is a function
of the form


$y = a^x$ (15)


The function is defined for all x, and we always consider the values
a > 0 (because raising a < 0 to a fractional power may result in
an imaginary number). Equality (15) can be obtained from
formula (12) if we solve it for x (which yields x = a^y) and then
interchange x and y. Consequently (see Sec. 21) the exponential function
and the logarithmic function are the inverse functions with respect
to each other.

Therefore the graphs of exponential functions which are depicted
in Fig. 42 for different bases a are obtained as the mirror images of
the corresponding graphs (shown in Fig. 40) with respect to the
bisector of the angle between the coordinate axes. If a > 1 the
exponential function is an increasing function, and the greater a, the
greater the rate of its increase. In case 0 < a < 1 the exponential
function is a decreasing function.

We often deal with the exponential function with a = e. In this
case there is special notation: y = e^x = exp x.

Any exponential function with an arbitrary base a can be reduced
to the base e; indeed, the definition of a logarithm implies that
a = e^(ln a) and therefore a^x = (e^(ln a))^x = e^(kx) where k = ln a.

28. Hyperbolic Functions. The hyperbolic sine, cosine and
tangent are, respectively, the functions


$\sinh x = \frac{e^x - e^{-x}}{2}, \quad \cosh x = \frac{e^x + e^{-x}}{2}, \quad \tanh x = \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}}$


At first these terms sound strange but their genuine sense (for
example, the connection between sin x and sinh x or between the
hyperbolic functions and a hyperbola) will be explained only when we
get to Secs. VIII.4 and XIV.8. Let us now establish some formulas
connecting these functions. Squaring the first two equalities we get


$\sinh^2 x = \frac{e^{2x} - 2 + e^{-2x}}{4}, \quad \cosh^2 x = \frac{e^{2x} + 2 + e^{-2x}}{4}$


Now subtracting and adding these two formulas we obtain


$\cosh^2 x - \sinh^2 x = 1, \quad \cosh^2 x + \sinh^2 x = \frac{e^{2x} + e^{-2x}}{2} = \cosh 2x$


The obtained formulas indicate a significant analogy between
hyperbolic and trigonometric functions. We leave to the reader the
deduction of the formulas


$\sinh 2x = 2 \sinh x \cosh x,$
$\sinh (a + b) = \sinh a \cosh b + \cosh a \sinh b,$
$1 - \tanh^2 x = \frac{1}{\cosh^2 x}$


and other similar formulas. Note that sinh 0 = 0 and cosh 0 = 1.

The functions sinh x and tanh x are odd whereas the function cosh x
is even; indeed, for example,


$\sinh(-x) = \frac{e^{(-x)} - e^{-(-x)}}{2} = \frac{e^{-x} - e^{x}}{2} = -\frac{e^{x} - e^{-x}}{2} = -\sinh x$


The construction of the graphs of sinh x and cosh x is illustrated
in Fig. 43. The graph of tanh x is shown in Fig. 44. To construct
this graph we find its points with the help of the graphs of sinh x
and cosh x. It is clear that the hyperbolic functions do not possess
the most important property of trigonometric functions, namely,
the periodicity. Besides, the range (see Sec. 15) of each hyperbolic
function considerably differs from the range of the corresponding
trigonometric function.
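
The identities of this section can be verified numerically straight
from the definitions. The Python sketch below is an illustration only;
the functions are deliberately defined through the exponential
function rather than taken from a library, and the sample arguments
and the tolerance are assumptions made for the example.

```python
import math


def sinh(x):
    return (math.exp(x) - math.exp(-x)) / 2


def cosh(x):
    return (math.exp(x) + math.exp(-x)) / 2


def tanh(x):
    return sinh(x) / cosh(x)


for x in (-2.0, -0.3, 0.0, 0.7, 1.5):
    assert abs(cosh(x) ** 2 - sinh(x) ** 2 - 1) < 1e-9
    assert abs(cosh(x) ** 2 + sinh(x) ** 2 - cosh(2 * x)) < 1e-9
    assert abs(sinh(2 * x) - 2 * sinh(x) * cosh(x)) < 1e-9
    assert abs(1 - tanh(x) ** 2 - 1 / cosh(x) ** 2) < 1e-9
    assert abs(sinh(-x) + sinh(x)) < 1e-12      # sinh is odd
    assert abs(cosh(-x) - cosh(x)) < 1e-12      # cosh is even
```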


Fig. 43 


The graph of tanh x has two asymptotes since, for large values of
|x|, we have e^(−|x|) ≪ 1 ≪ e^(|x|) (the sign ≪ means here “much
smaller”) and therefore tanh x ≈ 1 for large |x| and x > 0, and
tanh x ≈ −1 for large |x| and x < 0.


Fig. 44 


We sometimes consider the inverse hyperbolic functions which
are denoted as sinh⁻¹ x, cosh⁻¹ x and tanh⁻¹ x, respectively. Figs. 43
and 44 show that the first and the third functions are single-valued
(compare with Fig. 25) whereas the second one is two-valued. All
these functions can be expressed in terms of logarithms. In fact, let,
for example, y = sinh⁻¹ x. Then, by the definition of an inverse
function, we have


$x = \sinh y = \frac{e^y - e^{-y}}{2}$


i.e.

$e^y - e^{-y} - 2x = 0, \quad e^{2y} - 2x e^y - 1 = 0$

which implies e^y = x ± √(x² + 1). The left-hand side being positive,
the right-hand side should also be positive. Therefore we can take
only “+” in front of the radical. Now taking logarithms we obtain


$y = \sinh^{-1} x = \ln\left(x + \sqrt{x^2 + 1}\right)$ (16)
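
Formula (16) can be compared with a library implementation of the
inverse hyperbolic sine. The Python sketch below is an illustration
only; the function name arsinh and the sample arguments are
assumptions made for the example.

```python
import math


def arsinh(x):
    """Inverse hyperbolic sine by formula (16).

    For negative x of large magnitude the sum x + sqrt(x*x + 1) loses
    accuracy, so the formula is used here only for moderate arguments."""
    return math.log(x + math.sqrt(x * x + 1))


for x in (-3.0, -0.5, 0.0, 2.0, 10.0):
    assert abs(arsinh(x) - math.asinh(x)) < 1e-9
    assert abs(math.sinh(arsinh(x)) - x) < 1e-9   # it is indeed the inverse of sinh
```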


29. Trigonometric Functions. The function y = sin x with
period 2π is well known from courses on trigonometry. Its graph
(the sinusoid, the sine curve) is represented in Fig. 45. The function
is odd, has no points of discontinuity and is bounded (its values
lie between the limits −1 and +1). We have cos x = sin (x + π/2)
and therefore the graph of the function cos x is the same sinusoid
but translated π/2 units of length to the left; this graph is also shown
in Fig. 45. In many applications we encounter a sinusoidal,
“harmonic” relation of the form


$y = M \sin(\omega t + \alpha)$ (17)


where the independent variable t is interpreted as time, the
constant M > 0 is called an amplitude and ω > 0 is called a frequency
(circular, angular frequency). The sum ωt + α is called a phase
and the constant α is an initial phase which is obtained from the
phase by substituting t = 0. We can easily investigate in what
way parameters M, ω and α affect the form and the disposition
of the sinusoid (compare with Sec. 19). The amplitude M stretches
the range of the sinusoid so that it extends from −M to M, the
frequency ω changes the period 2π into T = 2π/ω and the presence of
the initial phase α displaces the sinusoid to the left by the distance
α/ω [since ωt + α = ω (t + α/ω) and therefore the value α/ω is added
to the argument]. The graph thus obtained is represented in
Fig. 46.


Fig. 45

A function of form (17) is obtained, in particular, when we
transform the expression A cos ωt + B sin ωt. The right-hand side
of (17) can be rewritten in the form M sin α · cos ωt +
+ M cos α · sin ωt and thus in order to obtain the equality


$A \cos\omega t + B \sin\omega t = M \sin(\omega t + \alpha)$ (18)


we must have A = M sin α and B = M cos α. From this it is easy
to find M and α: M = √(A² + B²) and tan α = A/B; the quadrant in
which α should be taken is defined by the signs of sin α and cos α,
i.e. by the signs of A and B.
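
In computation the choice of the quadrant is conveniently left to the
two-argument arctangent. The Python sketch below is an illustration
only; the function name amplitude_phase and the sample values of A, B
and ω are assumptions made for the example.

```python
import math


def amplitude_phase(A, B):
    """Return (M, alpha) such that
    A*cos(w*t) + B*sin(w*t) == M*sin(w*t + alpha) for all t."""
    M = math.hypot(A, B)          # M = sqrt(A^2 + B^2)
    alpha = math.atan2(A, B)      # sin(alpha) = A/M, cos(alpha) = B/M
    return M, alpha


A, B, w = -3.0, 4.0, 2.0          # hypothetical sample values
M, alpha = amplitude_phase(A, B)

for t in (0.0, 0.3, 1.0, 2.5):
    left = A * math.cos(w * t) + B * math.sin(w * t)
    right = M * math.sin(w * t + alpha)
    assert abs(left - right) < 1e-12
```

Here math.atan2(A, B) takes the signs of both A and B into account,
which is exactly the rule for selecting the quadrant stated above.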

In case the independent variable is interpreted not as time but
as a geometrical coordinate the sinusoidal relation is usually written
in the form y = M sin (kx + α) instead of (17). In this case k
is called a wave-number and λ = 2π/k is a wave-length.


Fig. 46


The function y = tan x has the period π since tan (x + π) =
= tan x. It has the points of discontinuity at x = π/2, π/2 + π,
π/2 − π, ... (this can be written in the general form as
x = π/2 + kπ where k = 0, ±1, ±2, ...). Indeed, at these points
cos x = 0 and therefore tan x = ±∞. The graph of the function (the
tangent curve) is represented in Fig. 47; it consists of an infinitude
of similar components and has infinitely many asymptotes. The graph
of the function y = cot x is also shown in Fig. 47. We have


$\cot x = -\tan\left(x - \frac{\pi}{2}\right)$


and therefore the curve has the same form but its disposition is
changed in the corresponding way.

The function y = Arc sin x is the inverse function of y = sin x
and therefore its graph (see Fig. 48) is the mirror image of the graph
of y = sin x about the bisector of the first quadrant angle. This
function is multiple-valued (more precisely, infinite-valued) and
therefore (see Secs. 20-21) one usually considers its principal branch
(the principal value of the arc sine) which is shown in Fig. 48 in
heavy line; this branch is denoted as

    $y = \text{arc sin}\, x$,  $-\pi/2 \le \text{arc sin}\, x \le \pi/2$

and is a single-valued function. Other branches of the function have
no special names.

Fig. 47    Fig. 48

The functions y = Arc cos x and y = Arc tan x can be investigated
in a similar way and we leave this to the reader.

We should note in conclusion that we shall always deal with
dimensionless (abstract) values of arc sin x. For example, we have

    $\text{arc sin}\, 1 = \pi/2 = 1.57$ *

Similarly, the values of the function y = sin x are taken for
dimensionless values of x. We mean here that the sine of a number x
is the sine of the angle of x radians. For instance,
$\sin 1 = \sin 57^\circ 18' = 0.8415$.

* Equalities of this kind between a number and its decimal value are
approximate. If one intends to stress this fact one writes
$\pi/2 \approx 1.57$. We are not going to mention stipulations of this
kind in the future.

30. Empirical Formulas. We have already mentioned (see Sec. 13)
that an experiment often yields the function $y = f(x)$ we are
interested in only in tabular form (2). In such a case the problem of
selecting an appropriate empirical formula for the function may arise.
We usually begin by plotting the values of the function on graph paper
or some other appropriate paper. Then we select a certain form of the
formula we are going to use. If the form is not implied by general
considerations we usually choose one of the functions described in
Secs. 22-29 or a simple combination of such functions (a sum of power
functions or of exponential functions and the like). In order to select
such a formula in the best way one must know the graphs of these
functions well. When selecting a function we must try to achieve a
resemblance between the characteristic peculiarities of the sought-for
function $\varphi(x)$ and of the function $f(x)$ under consideration.
For example, if the physical meaning of the function indicates that
$f(x)$ is even and $f(0) = 0$ then the function $\varphi(x)$ should also
have these properties and so on. It sometimes turns out that we cannot
find a single formula for the whole interval of $x$. Then it is
necessary to divide the interval into several parts and select, for each
of the parts, its own appropriate formula.

Fig. 49

After the form of the formula has been chosen it is necessary to 
determine the values of parameters entering into the formula. 

For example, suppose that after plotting the points we obtain 
the drawing shown in Fig. 49. If we have certain reasons to suspect 
that the experiment or the calculations of the values of the function
could contain essential errors we must simply discard the points
which fall out of the general form of the relationship described by 
the data represented in our drawing. For instance, the point P 
in Fig. 49 is a point of this kind. By the way, such points may some- 
times indicate that certain important factors were not taken into 
account and then, of course, we must pay much attention to them. 

The remaining points in Fig. 49 resemble a linear relation of the
form $y = ax + b$. In order to determine the parameters $a$ and $b$
let us draw a straight line such that the experimental points lie as
close as possible to the line. This can easily be done by means
of a transparent ruler. We apply the ruler to the drawing and then
approximately find the sought-for position of the ruler. For example,
the straight line drawn in Fig. 49 yields $b = 0.50$ and $a = 0.58$,
i.e. $y = 0.58x + 0.50$.

The selection of a linear relation described above is comparatively 
simple. Therefore when choosing some other kind of functional 
relation one often tries to introduce new variables so that there 
should be a linear relation between the new variables and then to 
determine the parameters entering into the linear relation. We can 
apply this method only if there are no more than two such parameters
since a linear function contains two parameters.

For example, let an experiment yield the following table of values:

x | 0.00 | 0.10 | 0.20 | 0.30 | 0.40 | 0.50 | 0.60 | 0.70 | 0.80 | 0.90 | 1.00
y | 0.00 | 0.04 | 0.03 | 0.08 | 0.17 | 0.29 | 0.45 | 0.66 | 0.91 | 1.22 | …

We leave it to the reader to plot the experimental points on
graph paper. The disposition of the points thus constructed
resembles the graph of a power function of the form $y = ax^{\alpha}$.
In order to determine the parameters $a$ and $\alpha$ we take logarithms
of both sides of the equality and denote $\log y = Y$, $\log x = X$ and
$\log a = A$. Then we arrive at the equality $Y = \alpha X + A$ and thus
we see that there is a linear relation between the new variables.

Fig. 50

By means of a table of logarithms we compile the table of the values
of the new variables:
X | -1.00 | -0.70 | -0.52 | -0.40 | -0.30 | -0.22 | -0.15 | -0.10  | -0.05 | 0.00
Y | -1.4  | …     | …     | -0.77 | -0.54 | …     | -0.18 | -0.044 | 0.085 | 0.18


The points thus obtained lie close enough to the straight
line drawn in Fig. 50. In drawing the straight line we should
pay more attention to the last three points whose positions are
determined with greater accuracy. Our construction yields the values
$A = 0.196$ and $\alpha = 2.44$, that is $a = 1.57$. Hence we finally
obtain $y = 1.57x^{2.44}$.

For some further rules and examples see [54].
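The straightening in logarithmic coordinates can also be carried out numerically. The sketch below replaces the ruler by an ordinary least-squares line through the points (X, Y); this is a swapped-in technique, not the graphical construction described above. The data are synthetic, noise-free points generated from the formula y = 1.57x^2.44 itself (an assumption made so that the expected answer is known); they are not the experimental table, and the function name `fit_power_law` is hypothetical.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**alpha by a least-squares line Y = alpha*X + A,
    where X = log10(x), Y = log10(y), A = log10(a). Returns (a, alpha)."""
    X = [math.log10(x) for x in xs]
    Y = [math.log10(y) for y in ys]
    n = len(X)
    mean_x = sum(X) / n
    mean_y = sum(Y) / n
    sxx = sum((u - mean_x) ** 2 for u in X)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(X, Y))
    alpha = sxy / sxx              # slope of the straight line
    A = mean_y - alpha * mean_x    # intercept of the straight line
    return 10 ** A, alpha

# Synthetic data (assumption): exact values of 1.57 * x**2.44, x = 0.1 ... 1.0.
xs = [0.1 * k for k in range(1, 11)]
ys = [1.57 * x ** 2.44 for x in xs]
a, alpha = fit_power_law(xs, ys)
print(round(a, 2), round(alpha, 2))
```

The point x = 0 has to be left out because its logarithm does not exist. With real measurements the points would be weighted unequally, as the text recommends for the last three points; the sketch treats all points alike.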


CHAPTER II 


Plane Analytic Geometry 


Analytic geometry is a branch of mathematics in which geometri- 
cal problems are investigated on the basis of the coordinate method 
by means of algebraic techniques. 


§ 1. Plane Coordinates 


1. Cartesian Coordinates. Cartesian coordinates are known from
elementary mathematical courses and we have already used them
(see Chapter I). Cartesian coordinates are named after R. Descartes.
R. Descartes and P. Fermat (1601-1665) are the founders of the
coordinate method.

Fig. 51

Several points are depicted in Cartesian coordinates in Fig. 51. It
should be noted that we take mutually perpendicular coordinate axes
here and that the unit of length is the same for both axes. The origin
of the coordinate system is placed at the point of intersection of the
axes, from which the distances along the axes are reckoned. (As was
mentioned in Sec. I.14, when we construct graphs, we can sometimes take
different scales for the axes and change the position of the point the
coordinates are reckoned from.) Each point in the coordinate plane has
certain uniquely determined coordinates and, conversely, to each ordered
pair of coordinates x and y there corresponds a certain uniquely
determined point of the plane. This basic property makes it possible to
consider the coordinates of points instead of the points themselves.

The coordinate axes break the plane into quarters (quadrants)
which are numbered in the way shown in Fig. 52. Each of these
quadrants is characterized by its specific combination of the signs
of abscissas and ordinates; this is also shown in Fig. 52.

2. Some Simple Problems Concerning Cartesian Coordinates. 

(1) The distance between two given points. Let the points
$M_1(x_1, y_1)$ and $M_2(x_2, y_2)$ be given (i.e. their coordinates are
known). It is required to determine the distance $d = M_1M_2$ (see
Fig. 53). The formula for the distance is implied by Pythagoras' theorem
applied to the right triangle $M_1M_2P$. Thus we have
$M_1M_2^2 = M_1P^2 + PM_2^2$, i.e.
$d^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2$, or

    $d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$    (1)

Fig. 52    Fig. 53

This formula and all the following formulas hold for any two points
$M_1$ and $M_2$ placed in an arbitrary manner in the coordinate plane.

(2) Division of a line segment in a given ratio. Let some points
$M_1(x_1, y_1)$ and $M_2(x_2, y_2)$ be given. It is required to find a
point $M(x, y)$ lying on the segment $M_1M_2$ such that the ratio of
division $M_1M / MM_2$ should be equal to $\lambda$ where $\lambda$ is a
given number (see Fig. 54). The solution of the problem follows from the
similarity of the triangles $M_1PM$ and $MQM_2$ which implies
$M_1P / MQ = M_1M / MM_2 = \lambda$, i.e.
$(x - x_1)/(x_2 - x) = \lambda$ and $x - x_1 = \lambda x_2 - \lambda x$.
From the last relations we deduce

    $x = \dfrac{x_1 + \lambda x_2}{1 + \lambda}$,  $y = \dfrac{y_1 + \lambda y_2}{1 + \lambda}$    (2)

(the expression for $y$ is obtained in like manner). In particular, in
the case $\lambda = 1$, that is when the segment is halved, we have

    $x = \dfrac{x_1 + x_2}{2}$,  $y = \dfrac{y_1 + y_2}{2}$
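Formulas (1) and (2) translate directly into a few lines of code. The sketch below is illustrative only: the function names and the sample points are assumptions, and points are represented as plain (x, y) pairs.

```python
import math

def distance(p1, p2):
    """Distance between two points by formula (1)."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def divide_segment(p1, p2, lam):
    """Point M on the segment M1M2 with M1M / MM2 = lam, by formula (2)."""
    x = (p1[0] + lam * p2[0]) / (1 + lam)
    y = (p1[1] + lam * p2[1]) / (1 + lam)
    return (x, y)

# Sample points (assumed for illustration).
m1, m2 = (1.0, 2.0), (7.0, 10.0)
m = divide_segment(m1, m2, 2.0)
print(m, distance(m1, m) / distance(m, m2))   # the ratio should be close to 2
print(divide_segment(m1, m2, 1.0))            # midpoint of the segment
```

Checking the ratio of the two distances, as in the first print, is a convenient way to make sure that the roles of the two end points have not been interchanged.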
(3) Transformation of coordinates without changing the scale. Suppose
we have an "old" coordinate system $x$, $y$. Now let a "new" coordinate
system $x'$, $y'$ be introduced. It is required to establish the
relationship between the old coordinates and the new ones. We shall
consider the following three cases.

Fig. 54

I. Let the new coordinates be the result of a translation of the
original ones. Suppose the new origin of the coordinate system has
the coordinates $(a, b)$ with respect to the original coordinate system.
Then, as is implied by Fig. 55, we have

    $x = x' + a$,  $y = y' + b$

Fig. 56    Fig. 57

II. Suppose the new coordinate axes are the mirror image of the
old ones (for example, the mirror image about the y-axis). Then,
as is seen in Fig. 56, the relationship is

    $x = -x'$,  $y = y'$    (3)

III. Let now the new axes be the result of turning the old axes
about the origin of the coordinate system through an angle $\alpha$ (see
Fig. 57). Then the equalities $OC = OD - AB$ and $CM = DB + AM$
imply

    $x = x' \cos\alpha - y' \sin\alpha$
    $y = x' \sin\alpha + y' \cos\alpha$    (4)

The general case of passing from one Cartesian system to another
is a mere combination of the above basic cases.

3. Polar Coordinates. Besides Cartesian coordinate systems we
can construct many other different coordinate systems, that is there
are many ways to determine the position of a point in a plane by
means of two numerical parameters (coordinates). An appropriate
system should be chosen for each occasion, Cartesian systems being
applied more often than others. In this section we shall consider
only one of the systems, namely, the so-called polar coordinate system
which is particularly convenient for investigating rotary motion. In
order to construct polar coordinates we arbitrarily choose a pole O and
a polar axis Op (see Fig. 58). Then the position of a point M can be
characterized by its polar radius (radius-vector) $\rho$ (i.e. the
distance from O to M) and its polar angle (vectorial angle) $\varphi$
(which is also called the phase or the amplitude of the point M);
$\rho$ and $\varphi$ are called the polar coordinates of the point M.
The vectorial angle $\varphi$ is considered positive if it is reckoned
in the positive direction (which is usually taken counterclockwise) and
negative otherwise. Several points with given polar coordinates are
constructed in Fig. 59. In particular, we see that the pole has the zero
radius-vector while its vectorial angle is undetermined and can be
chosen in an arbitrary way. In order to describe all the positions a
point can occupy in a plane it is sufficient to take the angle
$\varphi$ within the limits $-180^\circ < \varphi \le 180^\circ$ but it
may sometimes be convenient to consider the values of $\varphi$ which
fall outside the interval. The addition of 360° to the vectorial angle
of a point does not change its position.

Fig. 58    Fig. 59    Fig. 60

The relationship between the Cartesian coordinates and the polar
coordinates of a point in case the systems are placed as is shown
in Fig. 60 has the form

    $x = \rho \cos\varphi$,  $y = \rho \sin\varphi$

and, conversely,

    $\rho = \sqrt{x^2 + y^2}$,  $\tan\varphi = y/x$    (5)
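Formulas (5) can be tried out in a short sketch. One point deserves attention: the relation tan φ = y/x alone does not tell in which quadrant the angle lies, so the code below uses the two-argument arctangent, which takes the signs of both coordinates into account and returns an angle in the interval from -π to π. The function names are assumptions made for this illustration.

```python
import math

def to_polar(x, y):
    """Cartesian -> polar: returns (rho, phi) with phi in (-pi, pi]."""
    rho = math.hypot(x, y)     # rho = sqrt(x^2 + y^2)
    phi = math.atan2(y, x)     # quadrant chosen from the signs of x and y
    return rho, phi

def to_cartesian(rho, phi):
    """Polar -> Cartesian: x = rho*cos(phi), y = rho*sin(phi)."""
    return rho * math.cos(phi), rho * math.sin(phi)

rho, phi = to_polar(-1.0, 1.0)          # a point in the second quadrant
print(rho, math.degrees(phi))           # the angle comes out near 135 degrees
print(to_cartesian(rho, phi))           # close to the original (-1, 1)
```

For the pole itself the angle is undetermined; `math.atan2(0.0, 0.0)` simply returns 0, which is one admissible arbitrary choice.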


§ 2. Curves in Plane 


4. Equation of a Curve in Cartesian Coordinates. As was shown
in Sec. I.20, an equation of the form

    $F(x, y) = 0$    (6)

defines a curve (L) in the x, y-plane (i.e. in the plane where the
coordinate system is taken). The curve is the locus of all the points
whose coordinates satisfy equation (6). The relation of form (6)
is called the equation of the curve (L).

Conversely, if a curve (L) is given in the x, y-plane then its
geometrical properties can be expressed in terms of some analytic
relations between the coordinates of its points, and thus an equation
of the curve (L) is obtained in form (6). (It should be taken into
account of course that every equation can be rewritten in different
equivalent forms.) Consequently, the coordinate method enables us to
consider the equations of curves instead of the curves themselves.
Thus, geometrical problems can be reduced to algebraic ones, and the
latter, as a rule, can be solved in a simpler and more uniform way than
the former. For example, in order to verify whether a curve with an
equation of form (6) (or, as we simply say, "the curve $F(x, y) = 0$")
passes through a point $(a, b)$ it is sufficient to substitute the
coordinates of the point into the equation of the curve and verify
whether the equation is satisfied, i.e. whether we have $F(a, b) = 0$.

Fig. 61

As an example, let us deduce the equation of a circle (see Fig. 61).
Let the centre A of the circle have the coordinates $(a, b)$ and let
$M(x, y)$ be an arbitrary (moving, current) point of the circle. Then
the basic property characterizing a circle can be written as $AM = R$
where $R$ is the radius of the circle. Now, applying formula (1) for
the distance between two points we obtain

    $\sqrt{(x - a)^2 + (y - b)^2} = R$

or, squaring, we derive

    $(x - a)^2 + (y - b)^2 = R^2$

This relation is the one which is satisfied by the coordinates of all 
the points of the given circle and at the same time only the points 
belonging to the circle may satisfy this relation. Thus, this relation 
is an equation of the circle and a, b and R entering into the equa- 
tion are some fixed numbers (i.e. parameters which characterize 
the position and the size of the circle) whereas x and y are the current 
coordinates of a variable point of the circle. 

Now, contrary to the previous example, let an equation be
originally given, for example, the equation

    $x^2 + y^2 - 3x + 4y - 1 = 0$    (7)

Transforming the equation and completing the squares we obtain

    $(x - 3/2)^2 - (3/2)^2 + (y + 2)^2 - 2^2 - 1 = 0$,

    $(x - 3/2)^2 + (y + 2)^2 - 29/4 = 0$

Consequently the given equation is the equation of a circle with
centre at the point $(1.5, -2)$ and of radius $\sqrt{29}/2 = 2.69$.
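Completing the squares is easily mechanized. The sketch below assumes the equation is written as x² + y² + Dx + Ey + F = 0 (the letters D, E, F and the function name are notation introduced here, not in the text) and returns the centre and the radius, or None when the right-hand side obtained after completing the squares is not positive.

```python
import math

def circle_from_general(D, E, F):
    """For x^2 + y^2 + D*x + E*y + F = 0 return (cx, cy, R), or None
    if the equation does not describe a circle of positive radius."""
    cx, cy = -D / 2, -E / 2           # completing the squares gives the centre
    r_squared = cx * cx + cy * cy - F
    if r_squared <= 0:
        return None                   # a single point or no real points at all
    return cx, cy, math.sqrt(r_squared)

# Equation (7): x^2 + y^2 - 3x + 4y - 1 = 0
print(circle_from_general(-3.0, 4.0, -1.0))
```

For equation (7) the call reproduces the centre (1.5, -2) and a radius close to 2.69.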


If two curves with the equations $F_1(x, y) = 0$ and $F_2(x, y) = 0$
are given we can pose the problem of finding the point of intersection
of the curves. The sought-for point of intersection must belong to
both curves simultaneously and therefore its coordinates $x$ and $y$
must satisfy the equations of both curves. Hence, in order to determine
the coordinates we must solve the following system of equations:

    $F_1(x, y) = 0$
    $F_2(x, y) = 0$    (8)

Such a system of equations can have a number of distinct solutions
and this number corresponds to the number of possible points of
intersection. Of course, a solution is understood as a pair of certain
values $x$ and $y$ satisfying (8).

For example, let it be necessary to determine the points of
intersection of circle (7) with the straight line $y = x + b$ where $b$
is a constant. To do this we should solve the system of equations

    $x^2 + y^2 - 3x + 4y - 1 = 0$
    $y = x + b$

Expressing $y$ in terms of $x$ from the second equation, substituting
the expression thus obtained for $y$ into the first equation, removing
brackets and solving the quadratic equation in $x$ we obtain after
some transformations

    $x_1 = \dfrac{-1 - 2b + \sqrt{9 - 28b - 4b^2}}{4}$,  $y_1 = \dfrac{-1 + 2b + \sqrt{9 - 28b - 4b^2}}{4}$;

    $x_2 = \dfrac{-1 - 2b - \sqrt{9 - 28b - 4b^2}}{4}$,  $y_2 = \dfrac{-1 + 2b - \sqrt{9 - 28b - 4b^2}}{4}$

Fig. 62

Let us find for what values of $b$ both points of intersection coincide.
This will happen in case the expression under the radical sign
vanishes, which implies $b_{1,2} = \dfrac{-7 \pm \sqrt{58}}{2}$, i.e.
$b_1 = 0.31$ and $b_2 = -7.31$. The straight line $y = x + b$, as is
shown in Fig. 62, is tangent to the circle for these values of $b$. If
$b_2 < b < b_1$ then there are two distinct points of intersection:
$(x_1, y_1)$ and $(x_2, y_2)$. The straight line does not intersect the
circle at all when $b$ lies outside the above interval (in this case the
expression under the radical sign is negative).
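The dependence of the number of intersection points on b can be explored with a small sketch. It uses the quadratic 2x² + (2b + 1)x + (b² + 4b - 1) = 0 obtained by the substitution y = x + b, whose discriminant equals 9 - 28b - 4b²; the function name is an assumption made for this illustration.

```python
import math

def line_circle_points(b):
    """Points of intersection of the line y = x + b with circle (7),
    x^2 + y^2 - 3x + 4y - 1 = 0, as a sorted list of (x, y) pairs."""
    disc = 9 - 28 * b - 4 * b * b      # expression under the radical sign
    if disc < 0:
        return []                      # the line misses the circle
    root = math.sqrt(disc)
    # A set is used so that coincident roots are listed only once.
    xs = {(-1 - 2 * b + sign * root) / 4 for sign in (1, -1)}
    return sorted((x, x + b) for x in xs)

for b in (-8.0, -3.0, 0.0, 1.0):
    print(b, line_circle_points(b))

# Values of b at which the line is tangent to the circle.
print((-7 + math.sqrt(58)) / 2, (-7 - math.sqrt(58)) / 2)
```

In floating-point arithmetic the discriminant at the tangency values is only approximately zero, so the exactly tangent case is better recognized by comparing the discriminant with a small tolerance than by testing it for equality with zero.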

It may be noted that in many different problems the coincidence 
of the points of intersection whose coordinates are found from a 
system of type (8) usually indicates that the two given curves are 
tangent to each other at this common point, that is they have a com- 
mon tangent line at the point. 

5. Equation of a Curve in Polar Coordinates. An equation connecting
the plane coordinates of points taken with respect to any
given coordinate system defines some curve in the coordinate plane
(with some exceptions to this rule which will be discussed in Sec. 8).
In particular, let us consider a polar coordinate system. We suppose
that the equation is solved for $\rho$, i.e. it has the form

    $\rho = f(\varphi)$    (9)

Making $\varphi$ assume all the possible numerical values and finding
the corresponding values of $\rho$ we obtain the locus of these points
which form a curve in the plane, that is the graph of function (9) in
the polar coordinates.

Fig. 63

Let us take two examples. The graph of a linear relation of the
form $\rho = a\varphi + b$ is shown in Fig. 63. This curve is called the
spiral of Archimedes. It can be obtained by the superposition of a
uniform rotary motion of the radius-vector and a uniform rectilinear
motion along the radius. Indeed, if

    $\rho = vt + b$ and $\varphi = \omega t$ then $\rho = \dfrac{v}{\omega}\varphi + b$

Thus we see that the graph of one and the same function (a linear
function in this particular case) regarded with respect to a Cartesian
coordinate system and to a polar coordinate system has quite different
forms (see Sec. I.22).

The graph of the exponential function $\rho = e^{k\varphi}$ in polar
coordinates is depicted in Fig. 64. This curve is the so-called
logarithmic spiral. The curve winds infinitely round the pole (and tends
to the pole as $\varphi \to -\infty$) but never reaches it.

Fig. 64

The logarithmic spiral has some interesting properties. For example,
if we perform a similarity transformation, that is if we expand
the spiral uniformly in all directions, with the similarity coefficient
$m$ (i.e. all the linear sizes are increased $m$ times) we shall obtain
a new curve with the equation $\rho = me^{k\varphi}$. But

    $\rho = me^{k\varphi} = e^{k(\varphi + \frac{\ln m}{k})} = e^{k(\varphi + \alpha)}$ where $\alpha = \dfrac{\ln m}{k}$

and therefore the result would be as if we turned the original spiral
about the pole through the angle of $\alpha$ radians in the clockwise
direction. Indeed, generally the graph of a function
$\rho = f(\varphi + \alpha)$ is obtained from the graph of the function
$\rho = f(\varphi)$ by rotating the latter about the pole through the
angle $\alpha$ in the negative (i.e. clockwise) direction. (Why is it
so?) Consequently, the logarithmic spiral is similar to itself with any
arbitrary similarity coefficient. Only straight lines possess this
property among all the other curves in a plane.
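The self-similarity of the logarithmic spiral is easy to check numerically: stretching the spiral m times should give the same polar radii as shifting the angle by (ln m)/k. The sketch below does this for a few angles; the values of k and m and the function names are arbitrary assumptions chosen for illustration.

```python
import math

def spiral_radius(k, phi):
    """Polar radius of the logarithmic spiral rho = exp(k * phi)."""
    return math.exp(k * phi)

def equivalent_rotation(k, m):
    """Angle alpha = ln(m) / k such that m * exp(k*phi) = exp(k*(phi + alpha))."""
    return math.log(m) / k

k, m = 0.2, 3.0                      # sample values (assumed)
alpha = equivalent_rotation(k, m)
for phi in (-2.0, 0.0, 1.5, 4.0):
    stretched = m * spiral_radius(k, phi)
    shifted = spiral_radius(k, phi + alpha)
    print(phi, stretched, shifted)
```

The last two columns of the output should coincide up to rounding errors for every angle.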

In conclusion, let us discuss the notion of coordinate curves.
A curve is called a coordinate curve if one of the coordinates remains
constant along the curve. The coordinate curves of a Cartesian
coordinate system are two families of straight lines parallel to the
coordinate axes. The curves $\rho = \text{const}$, which are concentric
circles with centre at the pole, form a family of coordinate curves in a
polar coordinate system. The second family of coordinate curves in a
polar system are the curves $\varphi = \text{const}$, that is the family
of half-lines issuing from the pole (see Fig. 65).

Fig. 65

6. Parametric Representation of Curves and Functions. There are
some cases when both coordinates of a point (for instance, the
Cartesian coordinates) belonging to the coordinate plane are represented
as certain functions of a third variable which we denote by $t$:

    $x = \varphi(t)$,  $y = \psi(t)$    (10)

This third variable serves as a parameter determining the position
of a point having the coordinates $(x, y)$ in the coordinate plane.
When $t$ varies the corresponding point moves in the plane describing
some curve (L) (see Fig. 66). In this case we say that the curve (L)
is represented in parametric form (10). We have such a situation when,
for example, we investigate the motion of a point in the plane. In
this case $t$ is time, formulas (10) determine the law of motion and
the curve (L) is called the trajectory of motion. In other problems
of this kind a parameter entering into the equations may have some
other physical or geometrical meaning but even then it is usually
convenient to regard it as if it were time. It should be noted that
one and the same curve (L) can be represented by many different
forms of equations (10) since there can be different laws of motion
along one and the same trajectory (for instance, students walking
from a bus stop to their college can come 22 minutes or 2 minutes
before the lecture begins).

In order to pass from equations of a curve given in form (10)
to an equation of general form (6) we must eliminate the parameter
from both equations (10). For example, we can express $t$ in terms
of $x$ from the first equation and then substitute the result into the
second equation. Of course, we can use any other procedure which
eliminates $t$. But this is not always possible and besides we can
sometimes find it inexpedient to do so. Therefore we often retain
the parametric form of representation.

Equations (10) define a certain functional relation $y(x)$. Indeed,
if we, for example, make $x$ take on some value then the first equation
defines some value of $t$ (of course, there may be more than one such
value) and the second equation defines some value (or several values)
of $y$. We see that the function $y(x)$ turns out to be represented in
a parametric form and the curve (L) is its graph.

Fig. 66    Fig. 67

Example 1. Let us consider the motion of a shell fired with the
initial velocity $v_0$ at an angle $\alpha$ to the horizon (see
Fig. 67). We shall disregard air resistance, sphericity of the earth and
the earth's rotation. Then the horizontal component of the velocity must
be permanently constant and equal to $v_{0x} = v_0 \cos\alpha$ whereas
the vertical component of the velocity changes all the time. The
vertical motion has the constant acceleration of gravity $g$ and
therefore the distance passed along the vertical differs from the
vertical displacement which corresponds to the constant speed
$v_{0y} = v_0 \sin\alpha$ by $gt^2/2$ (this fact is well known from
mechanics). Thus, we obtain the law of motion in the x, y-plane:
$x = (v_0 \cos\alpha)t$ and $y = (v_0 \sin\alpha)t - gt^2/2$. This is
the law of motion that was sought for and at the same time it defines
the trajectory of motion in the parametric form. Eliminating $t$ we
deduce

    $y = (\tan\alpha)x - \dfrac{g}{2v_0^2 \cos^2\alpha}x^2$    (11)

The dependence $y(x)$ being a quadratic one, the trajectory is a
parabola (see Sec. I.23).
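The elimination of the parameter in this example can be verified numerically: a point computed from the law of motion for any moment t should lie on the parabola given by the explicit formula. The sketch below assumes SI units, g = 9.81 m/s² and arbitrary sample values of the initial velocity and the angle; the function names are hypothetical.

```python
import math

G = 9.81  # acceleration of gravity in m/s^2 (assumed value)

def position(v0, alpha, t):
    """Law of motion: x = (v0 cos a) t, y = (v0 sin a) t - g t^2 / 2."""
    x = v0 * math.cos(alpha) * t
    y = v0 * math.sin(alpha) * t - G * t * t / 2
    return x, y

def trajectory_height(v0, alpha, x):
    """Explicit form (11): y = (tan a) x - g x^2 / (2 v0^2 cos^2 a)."""
    return math.tan(alpha) * x - G * x * x / (2 * v0 ** 2 * math.cos(alpha) ** 2)

v0, alpha = 50.0, math.radians(30.0)   # sample values (assumed)
for t in (0.0, 1.0, 2.5, 4.0):
    x, y = position(v0, alpha, t)
    print(t, y, trajectory_height(v0, alpha, x))
```

The last two columns should coincide up to rounding errors, which confirms that formula (11) and the parametric law of motion describe one and the same curve.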

Example 2. Let us now consider the trajectory of a point of a circle
rolling upon a straight line without sliding (the dotted line in
Fig. 68 shows the position of the circle at the initial moment $t = 0$
whereas the continuous line indicates some moving position). We
shall regard the angle of rotation of the circle (denoted by $\psi$) as
a parameter. Then

    $x = OQ = OT - QT = \overset{\frown}{MT} - MP = R\psi - R\sin\psi = R(\psi - \sin\psi)$
    $y = QM = TN - PN = R - R\cos\psi = R(1 - \cos\psi)$    (12)

where the line segment $OT$ is equal to the length of the arc $MT$
according to the condition of the absence of sliding. Hence, we have
obtained the parametric equations of a curve which is called the
cycloid.

Fig. 68

The curve is infinite and has cusps which are seen in Fig. 68.
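A few points of the cycloid can be computed directly from its parametric equations x = R(ψ - sin ψ), y = R(1 - cos ψ), where ψ is the angle of rotation of the circle. The sketch below is a minimal illustration with an assumed function name and an assumed radius R = 1; it shows that the curve touches the straight line at ψ = 0, 2π, 4π, ... (the cusps) and reaches the height 2R halfway between them.

```python
import math

def cycloid_point(R, psi):
    """Point of the cycloid for the rotation angle psi (in radians)."""
    return R * (psi - math.sin(psi)), R * (1 - math.cos(psi))

R = 1.0  # radius of the rolling circle (assumed)
for n in range(3):
    cusp = cycloid_point(R, 2 * math.pi * n)        # the point is on the line
    top = cycloid_point(R, 2 * math.pi * n + math.pi)
    print(cusp, top)
```

Neighbouring cusps are 2πR apart, i.e. the length of the circumference, as the absence of sliding requires.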
The cycloid is one of the simplest curves belonging to the class
of roulettes; a roulette is the path in a fixed plane of any point in
a moving coincident plane when a given curve in the latter plane
rolls without sliding on a given curve in the former. The so-called
curtate and prolate cycloids (see Fig. 69) which are described by
a point rigidly connected with the plane of a circle and lying,
respectively, inside or outside the circle when the latter is rolling
upon a straight line give further examples of roulettes. By the way,
the prolate cycloid, as is seen in Fig. 69, has nodal points. Other
examples are the hypocycloid and the epicycloid which are described
by a point of a circle rolling upon another circle inside or outside
it, respectively (see Fig. 70). A hypocycloid with the ratio of the
radii of the circles equal to 1 : 4 is called the astroid. In case this
ratio is 1 : 2 the hypocycloid turns into a straight line segment.
The cardioid is an epicycloid with the ratio of the radii 1 : 1. These
curves are depicted, respectively, in Figs. 71 and 72, where the
equations of the curves are also put down. The deduction of these
equations is left to the reader. All these curves are important for
applications in the theory of mechanisms.

Fig. 69    Fig. 70

Fig. 71. Astroid. Show that the equality
$\overset{\frown}{TM} = \overset{\frown}{TA}$ implies the equations
$x = a\cos^3 t$, $y = a\sin^3 t$

7. Algebraic Curves. If the equation of a curve (L) has the form

    $P(x, y) = 0$    (13)

in Cartesian coordinates $x$ and $y$ where $P$ is a polynomial of degree
$n$ the curve (L) is said to be an algebraic curve of the nth order.
Nonalgebraic curves are called transcendental. For example (see
Sec. I.22), the graph of a linear function, that is a straight line,
is an algebraic curve of the first order, a quadratic parabola and
a circle are algebraic curves of the second order, the cubic parabola
and the semicubical parabola [the latter curve has the equation
$y = x^{2/3}$ which should be rewritten in the form $y^3 - x^2 = 0$ in
order to obtain an equation of form (13)] are algebraic curves of the
third order. On the other hand, the sinusoid, the tangent curve and the
graph of an exponential function are transcendental curves.

As an example of a curve of the fourth order let us consider
Cassinian ovals introduced by the Italian astronomer G. D. Cassini
(1625-1712). Let us take two points $F_1$ and $F_2$ in a plane and form
the product of their distances from a point $M$ in the plane. The
locus of all points $M$ in the plane for which the product is a constant
value is a Cassinian oval. Let us introduce Cartesian coordinate
axes in the way shown in Fig. 73. Denoting the coordinates of the
points $F_1$, $F_2$ and $M$ as $F_1(-a, 0)$, $F_2(a, 0)$ and
$M(x, y)$, respectively, we obtain, by formula (1), the equation

    $\sqrt{(x + a)^2 + y^2}\,\sqrt{(x - a)^2 + y^2} = b^2$

where $b^2$ is the given constant value of the product. The last
equation implies the final equation

    $(x^2 + y^2)^2 - 2a^2(x^2 - y^2) = b^4 - a^4$

which can be easily deduced. Here we have an important special
case when $b = a$ and the curve is $\infty$-shaped and called the
lemniscate. The lemniscate was discovered in 1694 by the Swiss
mathematician Jakob Bernoulli (1654-1705). Passing to the polar
coordinates by means of formulas (5) we deduce the polar equation of
the lemniscate $\rho^2 = 2a^2 \cos 2\varphi$.

Fig. 72. Cardioid. Show that the equality
$\overset{\frown}{TM} = \overset{\frown}{TO}$ implies the equation
$\rho = 2a(1 - \cos\varphi)$

In case $0 < b < a$ a Cassinian oval
consists of two separate parts.
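The polar equation of the lemniscate can be checked against the defining property of Cassinian ovals: for every point of the curve ρ² = 2a² cos 2φ the product of the distances from the points (-a, 0) and (a, 0) should be equal to a². The sketch below performs this check for a few angles; the function names and the value of a are assumptions made for illustration, and only angles with cos 2φ ≥ 0 are admissible.

```python
import math

def lemniscate_point(a, phi):
    """Cartesian coordinates of the lemniscate point with polar angle phi,
    using rho^2 = 2 a^2 cos(2 phi); requires cos(2 phi) >= 0."""
    rho = math.sqrt(2 * a * a * math.cos(2 * phi))
    return rho * math.cos(phi), rho * math.sin(phi)

def distance_product(x, y, a):
    """Product of the distances from (x, y) to F1(-a, 0) and F2(a, 0)."""
    return math.hypot(x + a, y) * math.hypot(x - a, y)

a = 1.5  # half the distance between F1 and F2 (assumed sample value)
for phi in (-0.7, -0.3, 0.0, 0.4, 0.7):
    x, y = lemniscate_point(a, phi)
    print(phi, distance_product(x, y, a), a * a)
```

The last two columns should agree up to rounding errors, in accordance with the case b = a of the Cassinian oval.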

It should be noted that when we speak about the degree of an
algebraic curve we always mean an equation in Cartesian coordinates.
For example, the equation of the spiral of Archimedes (see
Sec. 5) has the first degree in the polar coordinates but if we rewrite
it in the Cartesian coordinates we get
$\sqrt{x^2 + y^2} = a \arctan\dfrac{y}{x} + b$
and therefore we see that the spiral is a transcendental curve.
Spirals of all other types are also transcendental and, generally, it
turns out that all infinite curves possessing a periodicity property
are also transcendental.

The degree of an algebraic curve does not change if we replace a given 
Cartesian coordinate system by another one. 

Actually, for example, if we perform a parallel translation of
a coordinate system (see Sec. 2) then equation (13) of a curve turns
into

    $P(x' + a, y' + b) = 0$

The degree of the polynomial which is obtained after removing
brackets and collecting similar terms cannot become higher than
the original one (obviously, terms of higher degree cannot appear
in this case). One may think that after collecting terms the
degree of the polynomial could decrease in case the terms of higher
degree mutually cancel out. But this cannot occur since otherwise
the degree of the polynomial would increase under the inverse
transition from $x'$, $y'$ to $x$, $y$ and this, as was shown above, is
impossible. An analogous situation takes place for all other types of
transformation of Cartesian coordinates.

A change of a coordinate system for an immovable curve is
equivalent to the motion of the curve (in the opposite direction)
as the axes of coordinates remain immovable and therefore the
degree of an algebraic curve is invariant (unalterable) when the curve
moves as a whole.

A quantity or, in general, an object which does not change under 
some transformations is called an invariant of these transformations 
(or an invariant with respect to the transformations). For example, 
the area is an invariant of motions and angles are invariants not 
only of motions but of similarity transformations as well. Thus, the 
order of an algebraic curve is also an invariant of motions. This 
very important concept of an invariant was unfamiliar to an owner 
of a garden who was painting the fence. Knowing that he did not 
have enough paint he was doing the job extremely fast so as to 
finish it before the paint was used up. 

8. Singular Cases. There are some equations of the form
$F(x, y) = 0$ which define in the x, y-plane such a set of points that
it would hardly be possible to call it a line or a curve. We shall
illustrate such cases by some examples.

The equation $x^2 + y^2 + 1 = 0$ has no points in the coordinate
plane which satisfy it since the left-hand side is positive for all
$x$ and $y$. An equation of this kind is said to describe an imaginary
curve since if we use complex numbers we can put, for instance,
$x = i$, $y = 0$ etc. But this term does not change the fact that such
a curve does not exist as a real curve in the plane.

Only one point $x = 0$, $y = 0$ (the origin of the coordinate system)
satisfies the equation

    $x^2 + y^2 = 0$    (14)

Comparing this equation with the equation

    $x^2 + y^2 - R^2 = 0$    (15)

(see Sec. 4) we may consider equation (14) as if it defined a circle
with zero radius. Generally, if an object depends on parameters
and if it changes considerably and loses some of its essential
properties for certain values of the parameters we usually say that the
object degenerates for these values. In the above case circle (15)
depends on the parameter $R$. It degenerates into point (14) for
$R = 0$ and loses its main property, i.e. the property to be a curve.

The equation of an algebraic curve of the second order of the form

$$y^2 - x^2 = 0 \tag{16}$$

can be rewritten as

$$(y - x)(y + x) = 0$$

But a product is equal to zero if and only if at least one of its
factors equals zero. Therefore, either y − x = 0, i.e. y = x, or
y + x = 0, i.e. y = −x (see Fig. 74). Each of these equations
determines a straight line in the x, y-plane. Consequently,
a point satisfying equation (16) lies either on the first straight line
or on the second one. Thus, a curve defined by equation (16) is
nothing but a pair of straight lines or, as we say, it disintegrates
into a pair of straight lines. Hence, we see that an algebraic curve
of the second order may disintegrate into two straight lines.

Fig. 74

But we can by no means regard a hyperbola as a curve which 
disintegrates (Sec. 1.25). In this case we have a curve consisting 
of two components (branches) and each of these components is 
a half of the hyperbola but both branches are described by one and 
the same equation. As for the case of equation (16), each of the
straight lines obtained above has its own separate equation.
It appears quite clear that in this way we can artificially unite two
arbitrary curves. If these curves have equations $F_1(x, y) = 0$ and
$F_2(x, y) = 0$ then it is sufficient to take the equation

$$F_1(x, y) \cdot F_2(x, y) = 0$$
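
The uniting of two curves by multiplying their equations can be tried out numerically. The following Python sketch is only an illustration: it takes the two straight lines of equation (16) as the curves being united, and the function names are hypothetical, introduced for this example alone.

```python
def f1(x, y):
    """Left-hand side of the first line, y - x = 0."""
    return y - x


def f2(x, y):
    """Left-hand side of the second line, y + x = 0."""
    return y + x


def united(x, y):
    """Left-hand side of the united curve F1 * F2 = 0, here y^2 - x^2."""
    return f1(x, y) * f2(x, y)


def on_curve(f, x, y, tol=1e-12):
    """A point lies on the curve f(x, y) = 0 if f vanishes there (up to tol)."""
    return abs(f(x, y)) <= tol


# Points of either line satisfy the united equation; other points do not.
print(on_curve(united, 2, 2), on_curve(united, 3, -3), on_curve(united, 1, 2))
```

A point satisfies the product equation exactly when it satisfies at least one of the factors, which is the same argument as the one used for equation (16).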


We cannot regard as a disintegration of an algebraic curve the
decomposition of the parabola (shown in Fig. 23) into its upper and
lower halves according to the formula

$$y^2 - x = (y - \sqrt{x})(y + \sqrt{x}) = 0 \tag{17}$$

from which we obtain $y_{1,2} = \pm\sqrt{x}$ instead of $y^2 - x = 0$. Here
the key point is that we have a disintegration when a polynomial
on the left-hand side of equation (13) is factored into polynomials
whereas the factors entering into the right-hand side of (17) are
not polynomials (see Sec. I.17).


§ 3. First-Order and Second-Order Algebraic Curves


9. Curves of the First Order. As is shown in Sec. 7, in order to 
obtain an algebraic curve of the first order we must take a polyno- 
mial of the first degree and equate it to zero. Such a polynomial 
may contain only terms of the first degree and an absolute (con- 
stant) term. Therefore the general form of an equation of a curve 
of the first order is the following: 


$$Ax + By + C = 0 \tag{18}$$


There may occur two cases. If $B \neq 0$ then dividing the equation
by B and denoting

$$-\frac{A}{B} = k, \quad -\frac{C}{B} = b \tag{19}$$

we obtain

$$y = kx + b \tag{20}$$

We have shown (see Sec. 1.22) that this is the equation of a straight
line (we had a instead of k but this fact is of no importance). This
line is depicted in Fig. 75. In case B = 0 we divide the equation by
A and denote $-\frac{C}{A} = a$. This yields an equation of the form x = a
which is the equation of a straight line parallel to the y-axis. It
should be noted that for such a line the slope $k = \tan\frac{\pi}{2} = \pm\infty$
which also follows from relation (19) but in this case the equation
cannot be written in form (20).

Thus, the curves of the first order are straight lines. 
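
The two cases considered for equation (18) can be summarized in a short Python sketch. It is an illustration only; the function name and the form of the returned tuples are assumptions made for this example.

```python
def classify_line(A, B, C):
    """Reduce A*x + B*y + C = 0 to y = k*x + b or to x = a.

    Returns ("slope-intercept", k, b) if B != 0 and ("vertical", a) if B == 0.
    """
    if A == 0 and B == 0:
        raise ValueError("not an equation of the first degree")
    if B != 0:
        # Relations (19): k = -A/B, b = -C/B.
        return ("slope-intercept", -A / B, -C / B)
    # B == 0: divide by A, which gives x = -C/A.
    return ("vertical", -C / A)


print(classify_line(2, -1, 3))   # the line 2x - y + 3 = 0
print(classify_line(2, 0, -6))   # the line 2x - 6 = 0
```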

Let us consider several simple problems involving equations of 
straight lines. 

1. Let it be required to construct a straight line with the given
slope k passing through the given point $(x_1; y_1)$. When we say “to
draw” or “to construct” a straight line we mean, of course, in terms
of analytic geometry, to write down its equation. The sought-for
equation has form (20) but b entering into the equation is unknown.
Since the straight line passes through the given point, the coordinates
of the point must satisfy the equation of the line: $y_1 = kx_1 + b$.
Subtracting and thus eliminating b we arrive at the sought-for
equation:

$$y - y_1 = k(x - x_1) \tag{21}$$

Making k change and take all the possible values we obtain the
pencil of all possible straight lines passing through the point $(x_1; y_1)$.

We can also put $k = \pm\infty$ and thus obtain the vertical straight
line. But to do this we should divide both sides by k beforehand;
then after the substitution of $k = \pm\infty$ we simply obtain
$0 = x - x_1$, i.e. $x = x_1$. Similar precautions should be taken in other
problems involving infinite values of parameters.

Fig. 75 (the lines x = a and y = kx + b)          Fig. 76

2. Let it be required to draw a straight line through two given
points $(x_1, y_1)$ and $(x_2, y_2)$. The equation of the desired straight
line has form (21) but now k is unknown. But the condition that
the straight line should pass through the second point implies
$y_2 - y_1 = k(x_2 - x_1)$. Dividing equation (21) by this relation we
thus eliminate k and obtain

$$\frac{y - y_1}{y_2 - y_1} = \frac{x - x_1}{x_2 - x_1} \tag{22}$$

In this equation and in equation (21) x and y are current coordinates
of a moving (variable) point of the sought-for straight line.

3. Let it be required to determine the angle formed by two straight
lines with given slopes $k_1$ and $k_2$. The solution is implied by
Fig. 76:

$$\tan\alpha = \tan(\varphi_2 - \varphi_1) = \frac{\tan\varphi_2 - \tan\varphi_1}{1 + \tan\varphi_1\tan\varphi_2} = \frac{k_2 - k_1}{1 + k_1k_2} \tag{23}$$

4. The condition for the parallelism of two straight lines is obvious:
$k_1 = k_2$.

5. The condition for the perpendicularity of two straight lines
follows from problem 3: we have $\alpha = \pm\frac{\pi}{2}$ for mutually perpendicular
straight lines but $\tan\frac{\pi}{2} = \pm\infty$ and therefore $1 + k_1k_2 = 0$ or

$$k_2 = -\frac{1}{k_1}$$
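
The five problems above translate directly into short computations. The Python sketch below is a hedged illustration: all function names are hypothetical, and the two-point line is returned in the general form (18) in order to avoid the division in (22) when the line is vertical or horizontal.

```python
import math


def line_through_point(k, x1, y1):
    """Problem 1: (k, b) of the line y = k*x + b through (x1, y1), from (21)."""
    return k, y1 - k * x1


def line_through_two_points(x1, y1, x2, y2):
    """Problem 2: (A, B, C) with A*x + B*y + C = 0, a division-free form of (22)."""
    A = y2 - y1
    B = -(x2 - x1)
    C = -(A * x1 + B * y1)
    return A, B, C


def angle_between(k1, k2):
    """Problem 3: the angle (in radians) given by formula (23)."""
    denominator = 1 + k1 * k2
    if denominator == 0:          # perpendicular lines, tan(alpha) is infinite
        return math.pi / 2
    return math.atan((k2 - k1) / denominator)


def are_parallel(k1, k2):
    """Problem 4: equal slopes."""
    return k1 == k2


def are_perpendicular(k1, k2):
    """Problem 5: 1 + k1*k2 = 0."""
    return 1 + k1 * k2 == 0


print(line_through_point(2, 1, 5))
print(line_through_two_points(0, 0, 1, 1))
print(angle_between(0, 1), are_perpendicular(2, -0.5))
```

The exact comparisons in the last two functions are adequate for slopes given exactly; for slopes obtained from measurements a tolerance would be needed.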

10. Ellipse. An ellipse is the locus of all points in a plane for which
the sum of their distances from two fixed points in that plane (these 
points are called the focuses of the ellipse) is a constant quantity. 
This definition enables us to use the method of drawing an ellipse 
with the help of a taut thread illustrated in Fig. 77. This procedure 


Fig. 77 


allows us to visualize the form of an ellipse: the ellipse is a closed 
convex curve possessing two symmetry axes (called the principal 
axes of the ellipse) and having the centre of symmetry O (called the 
centre of the ellipse). 

In order to deduce the equation of an ellipse in the simplest form
let us place the coordinate axes in the way shown in Fig. 77 and
denote $F_1F_2 = 2c$ and $r_1 + r_2 = 2a$ where c and a are constants.
Then on the basis of formula (1) we can write, according to the
definition of an ellipse,
$\sqrt{(x + c)^2 + y^2} + \sqrt{(x - c)^2 + y^2} = 2a$ from
which we obtain, in succession,

$$\sqrt{(x + c)^2 + y^2} = 2a - \sqrt{(x - c)^2 + y^2},$$
$$(x + c)^2 + y^2 = 4a^2 - 4a\sqrt{(x - c)^2 + y^2} + (x - c)^2 + y^2,$$
$$a\sqrt{(x - c)^2 + y^2} = a^2 - cx, \quad a^2\left[(x - c)^2 + y^2\right] = (a^2 - cx)^2,$$
$$x^2(a^2 - c^2) + a^2y^2 = a^2(a^2 - c^2) \tag{24}$$

It is seen, from the triangle $F_1MF_2$, that 2a > 2c, that is
$a^2 - c^2 > 0$. Denote for brevity $a^2 - c^2 = b^2$. Then the last of
relations (24) implies the so-called canonical form of the equation of
the ellipse

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \tag{25}$$


This equation again shows that the coordinate axes serve as the 
symmetry axes of the ellipse because if a point (p, q) satisfies equa- 
tion (25) then the points (—p, q), (—p, —q) and (p, —q) also satisfy 
it (see Fig. 78). 
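
The connection between the canonical equation (25) and the definition of the ellipse can be checked numerically. The sketch below is an illustration under assumed values a = 5 and b = 3 (so that c = 4); the function names are hypothetical. It takes several points of the upper half of the ellipse and computes the sum of their distances from the two focuses.

```python
import math


def upper_ellipse_point(a, b, x):
    """Point (x, y) with y >= 0 on the ellipse x^2/a^2 + y^2/b^2 = 1."""
    return x, b * math.sqrt(1.0 - (x * x) / (a * a))


def focal_distance_sum(point, c):
    """Sum of the distances from the point to the focuses (-c, 0) and (c, 0)."""
    x, y = point
    return math.hypot(x + c, y) + math.hypot(x - c, y)


a, b = 5.0, 3.0                      # assumed semi-axes
c = math.sqrt(a * a - b * b)         # from a^2 - c^2 = b^2
for x in (-5.0, -2.0, 0.0, 1.5, 4.0):
    total = focal_distance_sum(upper_ellipse_point(a, b, x), c)
    print(f"x = {x:5.1f}: r1 + r2 = {total:.12f}")
```

For every point the printed sum should agree with 2a up to rounding errors.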


Fig. 78 


Putting y = 0 we get x = ±a and, similarly, x = 0 yields
y = ±b. Hence, a and b are, respectively, the lengths of the semi-major
axis and the semi-minor axis of the ellipse (see Fig. 78:
AO = OC = a and DO = OB = b). Besides, each summand entering
into the left-hand side of (25) cannot be greater than unity and therefore
$|x| \le a$ and $|y| \le b$. Consequently, the whole ellipse lies
inside the rectangle depicted in Fig. 78. The points A, B, C and D 
at which the ellipse intersects its symmetry axes are called the 
vertices of the ellipse. An ellipse has four vertices. 

The ratio $\frac{c}{a} = \frac{\sqrt{a^2 - b^2}}{a} = e$, $0 < e < 1$, is called the eccentricity
of the ellipse. This is a dimensionless quantity which does
not change under a similarity transformation of the ellipse when
all its sizes are increased k times since $\frac{kc}{ka} = \frac{c}{a}$.


The eccentricity of an ellipse characterizes its form (its “elongation”)
but not its sizes. Several ellipses are depicted in Fig. 79;
they all have the fixed length 2a of their major axes whereas their
eccentricity e varies, $c = ea$ and $b = a\sqrt{1 - e^2}$. This enables us
to see how the eccentricity affects the form of the ellipse: the focuses
draw together and the minor axis tends to the major one in its length
as e decreases. Passing to the limit as $e \to 0$ we have e = 0, c = 0
and b = a, that is we obtain a circle. Consequently, a circle may
be regarded as a singular (limiting) case of an ellipse whose focuses
merge and coincide with the centre of the circle; in this case the
eccentricity is equal to zero. On the contrary, if e approaches 1 the
ellipse becomes more and more elongated and it degenerates into
a straight line segment in the limiting process.

Fig. 79
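
The dependence of the form of the ellipse on its eccentricity can be tabulated with a few lines of Python. This is a sketch only: the function name is hypothetical and the major semi-axis a = 1 is an assumed value.

```python
import math


def axes_from_eccentricity(a, e):
    """Return (b, c) for an ellipse with major semi-axis a and eccentricity e.

    Uses c = e*a and b = a*sqrt(1 - e^2); requires 0 <= e < 1.
    """
    if not 0 <= e < 1:
        raise ValueError("the eccentricity of an ellipse satisfies 0 <= e < 1")
    return a * math.sqrt(1.0 - e * e), e * a


# e = 0 gives a circle (b = a, c = 0); as e approaches 1 the ellipse flattens.
for e in (0.0, 0.3, 0.6, 0.9, 0.99):
    b, c = axes_from_eccentricity(1.0, e)
    print(f"e = {e:4.2f}: b = {b:.4f}, c = {c:.4f}")
```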

An ellipse can be obtained by a uniform contraction of a circle
in a certain direction. Indeed, let us, for instance, consider the
uniform contraction towards the x-axis when all the sizes in the
direction of the y-axis decrease k times (see Fig. 80). If a point
M (x, y) lies on the curve obtained as a result of the contraction then
the point N (x, ky) must belong to the circle. This implies
$x^2 + (ky)^2 = R^2$ or $\frac{x^2}{R^2} + \frac{y^2}{(R/k)^2} = 1$, i.e. we get an ellipse with the
semi-axes R and $\frac{R}{k}$.

Fig. 80

We can easily deduce now the parametric equations of an ellipse
using the above proved property. In fact, the equations x = R cos t
and y = R sin t ($0 \le t < 2\pi$) define the circle of radius R (R is
given) with centre at the origin of the coordinate system. (Verify
this!) Performing the contraction we obtain x = R cos t and
$y = \frac{R}{k}\sin t$. If now we introduce the semi-axes a = R and $b = \frac{R}{k}$
we finally derive the parametric equations of the ellipse:

$$x = a\cos t, \quad y = b\sin t \quad (0 \le t < 2\pi) \tag{26}$$

Fig. 81


We shall show in Sec. XI.6 that a uniform contraction of an 
ellipse again yields an ellipse. 
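
The passage from the circle to the ellipse by contraction can be verified numerically. In the Python sketch below the radius R = 2 and the contraction factor k = 4 are assumed values and the function names are hypothetical; the residual of the canonical equation (25) is computed for contracted points of the circle.

```python
import math


def contracted_circle_point(R, k, t):
    """Point of the circle x = R cos t, y = R sin t after contraction k times towards the x-axis."""
    return R * math.cos(t), R * math.sin(t) / k


def ellipse_residual(x, y, a, b):
    """Value of x^2/a^2 + y^2/b^2 - 1; it vanishes on the ellipse (25)."""
    return (x * x) / (a * a) + (y * y) / (b * b) - 1.0


R, k = 2.0, 4.0                 # assumed radius and contraction factor
a, b = R, R / k                 # semi-axes of the resulting ellipse
for t in (0.0, 0.7, 1.5, 3.0, 5.0):
    x, y = contracted_circle_point(R, k, t)
    print(f"t = {t:3.1f}: residual = {ellipse_residual(x, y, a, b):.2e}")
```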

As is known, an orthogonal projection of a plane figure results 
in a uniform contraction of the figure and therefore the orthogonal 
projection of a circle yields an ellipse (see Fig. 81).

We again obtain an ellipse when considering the section of a right 
circular cylinder or of a cone by a plane. Fig. 82 shows such a section 
for the case of a cylinder. In order to prove that the section repre- 
sents an ellipse we inscribe two spheres into the cylinder in such 
a way that they should touch the plane at the points $F_1$ and $F_2$.
As is known, two tangents drawn to a sphere from a common point
are equal. Therefore, we can write, for any point M belonging to the
section, $MF_1 + MF_2 = MN_1 + MN_2 = N_1N_2 = \text{const}$ (see
Fig. 82). This relation implies that our assertion is true. The corresponding
construction for the case of a cone is analogous. The
properties of an ellipse are widely used in drawing. 

11. Hyperbola. We have already dealt with a hyperbola (see 
Sec. 1.25). But let us now forget about it for a while; later on in 
Sec. 13 we shall establish the connection between Secs. 11 and 1.25. 
Here we shall give a new definition of a hyperbola: a hyperbola is
the locus of all points in a plane for which the difference between their
distances to two given points (called the focuses of the hyperbola)
is a constant quantity. Choosing the coordinate axes as is shown in
Fig. 83 and denoting $F_1F_2 = 2c$ and $r_1 - r_2 = \pm 2a$ we obtain the
equation of the hyperbola in the form
$\sqrt{(x + c)^2 + y^2} - \sqrt{(x - c)^2 + y^2} = \pm 2a$. Now carrying out some transformations
similar to those in Sec. 10 we arrive at the same relation (24). (Check
this!) But in this case the triangle $F_1MF_2$ implies 2a < 2c and
therefore we cannot denote $a^2 - c^2 = b^2$ as it was done in Sec. 10.
(Why is it so?) Therefore we denote $a^2 - c^2 = -b^2$ which is permissible.
Then we deduce from (24) the relation

$$-b^2x^2 + a^2y^2 = -a^2b^2$$

and finally get the canonical form of the equation of the hyperbola:

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \tag{27}$$

Fig. 82          Fig. 83


This equation shows that a hyperbola has two symmetry axes 
(its principal axes) and a centre of symmetry (the centre of the hyperbola).
Putting y = 0 we get x = ±a and putting x = 0 we obtain
y = ±ib. Consequently, the x-axis intersects the hyperbola at two
points (which are the vertices of the hyperbola). This axis is called
the transverse axis of the hyperbola. The y-axis does not intersect
the hyperbola and is called its conjugate axis. The constants a and
b are called the semi-axes of the hyperbola.
Besides, equation (27) shows that $\frac{x^2}{a^2} \ge 1$, that is x must be either
≤ −a or ≥ a (see Fig. 84).

A hyperbola has two asymptotes. We shall demonstrate this 
property restricting ourselves only to the case of the first quadrant 
(which is sufficient because of the symmetry properties of a hyperbola).
From (27) it follows that

$$y = \frac{b}{a}\sqrt{x^2 - a^2}$$

Later on (in particular, in the end of Sec. IV.22) we shall present
some general rules for investigating expressions of this kind. But
we are not familiar with these rules yet and therefore let us apply
certain artificial transformations which make it possible to “educe”
$\frac{b}{a}x$ from $\frac{b}{a}\sqrt{x^2 - a^2}$:

$$\frac{b}{a}\sqrt{x^2 - a^2} = \frac{b}{a}\left[x + \left(\sqrt{x^2 - a^2} - x\right)\right] = \frac{b}{a}x + \frac{b}{a}\left(\sqrt{x^2 - a^2} - x\right)$$

To investigate the behaviour of the second summand $\frac{b}{a}\left(\sqrt{x^2 - a^2} - x\right)$
as x increases let us multiply and, simultaneously, divide the summand
by $\sqrt{x^2 - a^2} + x$. This yields

$$y_{\text{hyp}} = \frac{b}{a}x + \frac{b}{a}\,\frac{\left(\sqrt{x^2 - a^2} - x\right)\left(\sqrt{x^2 - a^2} + x\right)}{\sqrt{x^2 - a^2} + x} = \frac{b}{a}x - \frac{ab}{\sqrt{x^2 - a^2} + x}$$

The fraction entering as the second summand into the right-hand
side unlimitedly approaches zero when the variable point M
travels into infinity along the hyperbola. Now taking the straight
line $y_{\text{st}} = \frac{b}{a}x$ we see that the difference $\delta = y_{\text{st}} - y_{\text{hyp}}$
unlimitedly approaches zero as $x \to \infty$ and therefore this straight line
is an asymptote of the hyperbola. Taking into account the symmetry
we finally obtain the equations of the asymptotes:

$$y = \pm\frac{b}{a}x$$

Fig. 84
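
The approach of the hyperbola to its asymptote in the first quadrant can be observed numerically. The Python sketch below is an illustration under assumed semi-axes a = 3 and b = 4; the function names are hypothetical. It compares the directly computed difference between the ordinates of the straight line and of the hyperbola with the fraction obtained after the transformation described above.

```python
import math


def asymptote_gap(a, b, x):
    """Difference between the ordinates of the line y = (b/a)x and of the hyperbola, x >= a."""
    y_straight = (b / a) * x
    y_hyperbola = (b / a) * math.sqrt(x * x - a * a)
    return y_straight - y_hyperbola


def asymptote_gap_transformed(a, b, x):
    """The same difference written as the fraction a*b / (sqrt(x^2 - a^2) + x)."""
    return a * b / (math.sqrt(x * x - a * a) + x)


a, b = 3.0, 4.0                 # assumed semi-axes
for x in (10.0, 100.0, 1000.0):
    print(f"x = {x:7.1f}: gap = {asymptote_gap(a, b, x):.6f}, "
          f"fraction = {asymptote_gap_transformed(a, b, x):.6f}")
```

The two columns should agree, and both decrease towards zero as x grows.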

It can be shown that the section of the surface of a right circular 
cone (infinitely prolonged in both directions) by a plane which forms 
an angle with the axis of the cone is a hyperbola 
provided the angle is less than the one formed by 
the axis and the slant side of the cone (see Fig. 85). 
Try to prove this assertion reasoning as in the end 
of Sec. 10. 

The properties of a hyperbola can be illustrated by the following
example. Suppose a sound signal is issued from a point A. Let this
signal be received at two points B and C. Suppose the signal
is received t sec earlier at B than at C. Then we can guarantee that
the point A lies on a part of a hyperbola (the nearest to the point B)
having its focuses at B and C and the transverse semi-axis $\frac{v_s t}{2}$
where $v_s$ is the speed of sound (let the reader explain this fact).
If two experiments of this kind are carried out the position of the
point A is determined as the point of intersection of the
corresponding hyperbolas.

Fig. 85
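
The sound-ranging example can be sketched in Python. Everything specific here is an assumption made for illustration: the receivers B and C are placed at (−c, 0) and (c, 0) with c = 1000 m, the speed of sound is taken as 340 m/s, the source A is put at (−300, 400), and the function names are hypothetical.

```python
import math


def arrival_delay(source, c, v_s):
    """Time by which the signal reaches B = (-c, 0) earlier than C = (c, 0)."""
    x, y = source
    distance_to_b = math.hypot(x + c, y)
    distance_to_c = math.hypot(x - c, y)
    return (distance_to_c - distance_to_b) / v_s


def semi_axes_from_delay(c, t, v_s):
    """Semi-axes (a, b) of the hyperbola with focuses (-c, 0), (c, 0) and a = v_s*t/2."""
    a = v_s * t / 2.0
    if not 0 < a < c:
        raise ValueError("the delay must satisfy 0 < v_s*t/2 < c")
    return a, math.sqrt(c * c - a * a)


c, v_s = 1000.0, 340.0          # assumed half-distance BC (m) and speed of sound (m/s)
source = (-300.0, 400.0)        # assumed position of the point A
t = arrival_delay(source, c, v_s)
a, b = semi_axes_from_delay(c, t, v_s)
x, y = source
print(f"delay t = {t:.4f} s, a = {a:.2f} m, b = {b:.2f} m")
print(f"x^2/a^2 - y^2/b^2 = {x * x / (a * a) - y * y / (b * b):.12f}")
```

The last printed value should be equal to unity up to rounding errors, that is the source lies on the hyperbola (27) with these semi-axes.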

12. Relationship Between Ellipse, Hyperbola and Parabola. There
is a close relationship between an ellipse (see Sec. 10), a hyperbola
(see Sec. 11) and a parabola (see Sec. 1.23). This can be accounted
for by the fact that all these curves are algebraic curves of the second 
order and, as it will be shown in Sec. 13, there are no other curves 
of the second order except for some singular cases similar to those 
discussed in Sec. 8. There are many problems involving parameters 
whose solution is one of these curves depending on the values of 
the parameters. In such circumstances the parabola (or its degenera- 
ted forms) usually occupies an intermediate position between the 
ellipse and the hyperbola. 

Let us consider the intersection of a right circular cone (depicted 
in Fig. 86) with a plane which turns about an axis drawn perpen- 
dicularly to the axis of the cone (for example, about the axis pp). 
When the slope is slight (we mean the angle between this plane 
and the plane perpendicular to the axis of the cone) we have an 
elliptic section. The ellipse elongates as the slope increases and its 
eccentricity also increases. When the plane intersects both parts
of the double cone we have a hyperbolic section. Besides, we see 
that in the intermediate position when the plane is parallel to the 
slant side of the cone we have an intersection line which is infinite 
but still consists of one component. There will be no singular cases 
here similar to those indicated in Sec. 8 and, besides, a degeneration 
of an algebraic curve of the second order cannot yield a curve of a 
higher order. Hence the intersection line is a parabola. On the basis 
of these properties an ellipse, a hyperbola and a parabola are called 
conic sections. 

Let us now apply the same point of view to the discussion of the 
simplest equation of an algebraic curve of the second order in polar 
coordinates. We begin with the polar equation of an ellipse. Let 
us put the pole O at the right focus (see Fig. 87). Applying the cosine
law to the triangle AMO where A is the left focus and M is a variable
point of the ellipse we obtain

$$AM^2 = AO^2 + OM^2 - 2\,AO \cdot OM\cos(180^\circ - \varphi);$$
$$(2a - \rho)^2 = (2c)^2 + \rho^2 + 2 \cdot 2c\rho\cos\varphi;$$
$$4a^2 - 4a\rho + \rho^2 = 4(a^2 - b^2) + \rho^2 + 4c\rho\cos\varphi$$

which implies

$$\rho = \frac{b^2}{a + c\cos\varphi} = \frac{b^2/a}{1 + \frac{c}{a}\cos\varphi} \tag{28}$$

Denoting $\frac{b^2}{a} = p$ for brevity (p is called the focal parameter of the
ellipse) we finally deduce the equation

$$\rho = \frac{p}{1 + e\cos\varphi} \tag{29}$$

(In case the pole is placed at the left focus we shall have the
expression $1 - e\cos\varphi$ in the denominator.) If now we take a hyperbola
and place the pole at its left focus (see Fig. 88) then after similar
transformations (which we leave to the reader) we arrive at the
same formula (28). If we then denote $\frac{b^2}{a} = p$ and $\frac{c}{a} = e$ the
equation (29) will be obtained again. But in this case the dimensionless
quantity e (also called the eccentricity of the hyperbola) should be > 1.

Fig. 86

It is easy to verify that if we take equation (29) with e = 1 and
pass from the polar coordinates to the Cartesian coordinates according
to formulas (5) we obtain the equation of a parabola. In fact,
$\rho = \frac{p}{1 + \cos\varphi}$ and $\rho + \rho\cos\varphi = p$ which implies
$\sqrt{x^2 + y^2} + x = p$, $\sqrt{x^2 + y^2} = p - x$ and $x^2 + y^2 = p^2 - 2px + x^2$.
Therefore, finally,

$$y^2 = p^2 - 2px, \quad \text{that is} \quad x = \frac{p}{2} - \frac{y^2}{2p}$$

(see Sec. 1.23). Thus, equation (29) represents a parabola for e = 1.

Fig. 87          Fig. 88


The pole of the polar coordinates which was introduced above 
is called the focus of the parabola. We see that a parabola, in con- 
trast to an ellipse or a hyperbola, has only one focus. Now recall 
an interesting fact shown in Fig. 79: in case e = 1 an ellipse turns 
into a line segment. Thus, the degeneration may yield different 
results in different problems. 
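
The role of the eccentricity in equation (29) can be illustrated by a short Python sketch. The function names are hypothetical and the focal parameter p = 2 is an assumed value. The first function names the curve according to the value of e, and the last one checks, for e = 1, that the Cartesian coordinates of the points of (29) satisfy the relation $y^2 = p^2 - 2px$ which follows from $\sqrt{x^2 + y^2} = p - x$.

```python
import math


def conic_type(e):
    """Name of the curve rho = p / (1 + e cos(phi)) according to the eccentricity e."""
    if e < 0:
        raise ValueError("the eccentricity cannot be negative")
    if e == 0:
        return "circle"
    if e < 1:
        return "ellipse"
    if e == 1:
        return "parabola"
    return "hyperbola"


def polar_point(p, e, phi):
    """Cartesian coordinates of the point of rho = p / (1 + e cos(phi))."""
    rho = p / (1.0 + e * math.cos(phi))
    return rho * math.cos(phi), rho * math.sin(phi)


def parabola_residual(p, phi):
    """For e = 1: the value of y^2 - (p^2 - 2*p*x), which should vanish."""
    x, y = polar_point(p, 1.0, phi)
    return y * y - (p * p - 2.0 * p * x)


for e in (0.0, 0.5, 1.0, 1.5):
    print(e, conic_type(e))
for phi in (0.3, 1.0, 2.0, 2.8):
    print(f"phi = {phi}: residual = {parabola_residual(2.0, phi):.2e}")
```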

Equation (29) is applied, in particular, to the problem of motion 
of two bodies under their mutual Newtonian attraction which is 
known as the problem of two bodies in celestial mechanics. Let us 
consider, for example, launching an artificial satellite of the earth 
from a point T (lying outside the earth’s atmosphere) in the horizontal
direction (see Fig. 89). If the initial velocity $v_0$ is not sufficient
the satellite will not revolve round the earth. When the “first
cosmic velocity” is achieved the satellite will revolve round the
earth in a circular orbit with centre at the centre of the earth. If
the velocity $v_0$ is then increased the satellite will move in an elliptic
orbit and the centre of the earth will be at one of the focuses of the
ellipse.

The further increase of the velocity $v_0$ makes the eccentricity of
the ellipse increase and the second focus of the ellipse moves away from
the first one. After the “second cosmic velocity” (the escape velocity)
has been achieved the trajectory becomes parabolic and the satellite
will not return to the point T. Therefore a parabola may be regarded
as an ellipse with one of its focuses removed into infinity. The further 
increase of the velocity makes the trajectory turn into a hyperbola 
and thus the second focus appears again but this time “on the other 
side”. The centre of the earth remains at one of the focuses of the 
orbit all the time. 


Fig. 89 Fig. 90 


There is one more way of defining conic sections. Rewrite equation
(29) in the form

$$\rho + \rho e\cos\varphi = p \quad \text{or} \quad \rho = p - \rho e\cos\varphi = e\left(\frac{p}{e} - \rho\cos\varphi\right)$$

Now we see that the expression in the parentheses obtained above
is just equal to the length of the line segment MM' shown in Fig. 90
(verify this!) where the straight line ll is drawn perpendicularly
to the polar axis at the distance $\frac{p}{e}$ from the pole. But $\rho = OM$,
that is we obtain $OM = e \cdot MM'$ which implies $\frac{OM}{MM'} = e = \text{const}$.
Thus, an ellipse, a hyperbola and a parabola can be defined in a 
new way as the locus of all points in a plane for which the ratio of 
their distances from a certain point (which is a focus) to their distances 
from a certain straight line (the so-called directrix) is a constant 
quantity. 
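
The focus-directrix property can be checked numerically for all three curves at once. In the sketch below the function name is hypothetical and p = 2 is an assumed value; the pole is the focus and the directrix is the straight line perpendicular to the polar axis at the distance p/e from the pole.

```python
import math


def focus_directrix_ratio(p, e, phi):
    """Ratio OM / MM' for the point M of rho = p / (1 + e cos(phi)).

    OM = rho is the distance from the focus (the pole) and
    MM' = p/e - rho*cos(phi) is the distance from the directrix.
    """
    rho = p / (1.0 + e * math.cos(phi))
    distance_to_directrix = p / e - rho * math.cos(phi)
    return rho / distance_to_directrix


for e in (0.5, 1.0, 2.0):           # ellipse, parabola, hyperbola
    for phi in (0.2, 1.0):
        print(f"e = {e}, phi = {phi}: ratio = {focus_directrix_ratio(2.0, e, phi):.12f}")
```

In every line the printed ratio should coincide with e up to rounding errors.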

13. General Equation of a Curve of the Second Order. Let us put 
down the general form of an equation of an algebraic curve of the 
second order [as an analogue to equation (18)]: 


$$Ax^2 + 2Bxy + Cy^2 + Dx + Ey + F = 0 \tag{30}$$

(we write 2B instead of B because, as it will be seen, this simplifies
some formulas which will be further obtained). Now our aim is to
transform the Cartesian coordinates (see Sec. 2) in such a way that
equation (30) should take the simplest form; this will enable us to
find out what curve is determined by the equation.

Since the canonical equations contain no terms with the product of
the coordinates, we first of all try to turn the coordinate axes in
such a manner that this product should be eliminated. According
to formulas (4), after the axes are turned through an angle $\alpha$, the
equation in the new coordinates will have the form

$$A(x'\cos\alpha - y'\sin\alpha)^2 + 2B(x'\cos\alpha - y'\sin\alpha)(x'\sin\alpha + y'\cos\alpha) + C(x'\sin\alpha + y'\cos\alpha)^2 + D(x'\cos\alpha - y'\sin\alpha) + E(x'\sin\alpha + y'\cos\alpha) + F = 0 \tag{31}$$

Removing the parentheses and collecting the terms containing the
product x'y' we see that the coefficient in x'y' is equal to

$$-2A\cos\alpha\sin\alpha + 2B\cos^2\alpha - 2B\sin^2\alpha + 2C\sin\alpha\cos\alpha = 2B\cos 2\alpha + (C - A)\sin 2\alpha$$

Thus, we must have

$$2B\cos 2\alpha + (C - A)\sin 2\alpha = 0, \quad \text{i.e.} \quad \tan 2\alpha = \frac{2B}{A - C} \tag{32}$$

Now we find the angle $\alpha$ from equation (32) and thus determine
through what angle the coordinate axes should be turned.
After the axes are turned through the angle $\alpha$ the equation takes
the form

$$A'x'^2 + C'y'^2 + D'x' + E'y' + F = 0 \tag{33}$$


where A’, C’, D’ and E’ are some coefficients which can be found 
by collecting similar terms in (31). But, as it will be proved in 
Sec. X1I.11, a rotation of the coordinate axes does not change the 
quantity $AC - B^2$ although the coefficients A, B and C in terms
of the second degree may vary, that is the expression is an invariant.
There is no term with x'y' in equation (33) and therefore,
since B' = 0, we obtain

$$A'C' - B'^2 = A'C' = AC - B^2$$


Consequently, if the expression $AC - B^2$ written for original
equation (30) is positive then the coefficients A' and C' in equation
(33) have the same signs since their product is positive. Such a case
is called elliptic. If $AC - B^2 < 0$ then A' and C' are of opposite
signs (this is a hyperbolic case). Finally, if $AC - B^2 = 0$ then one
of the coefficients (A' or C') is equal to zero (the so-called parabolic
case).

Let us turn to the elliptic case. Completing the square in equation
(33) (compare with Sec. 4) we arrive at an equation of the form

$$A'(x' - a)^2 + C'(y' - b)^2 + F' = 0$$

Let us now translate the axes x' and y' (see Sec. 2) and pass to the
new coordinates x'' = x' − a and y'' = y' − b. Then the equation
turns into

$$A'x''^2 + C'y''^2 = -F', \quad \text{that is} \quad \frac{x''^2}{-F'/A'} + \frac{y''^2}{-F'/C'} = 1 \tag{34}$$


For the sake of definiteness let us suppose that both A' and C'
are positive. Then if F' < 0 we obtain the canonical equation of
an ellipse. Consequently, the original curve is an ellipse but
displaced and turned with respect to the axes x and y. If F' > 0
or F' = 0 we obtain the singular cases mentioned in Sec. 8 since
there will be, respectively, either an imaginary curve or a single point.

Similarly, in the hyperbolic case the curve must be a hyperbola
and in the parabolic case the curve must be a parabola with the
exception of the singular cases described in Sec. 8 which, as a rule,
are of no significance.

Fig. 91
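
The classification by the sign of $AC - B^2$ and the choice of the angle by formula (32) can be sketched in Python. The function names are hypothetical. The use of `math.atan2` is a design choice made here so that the case A = C, in which $\tan 2\alpha$ is infinite, needs no special treatment. The last function returns the second-degree coefficients after the rotation, obtained by removing the parentheses in (31).

```python
import math


def classify_second_order(A, B, C):
    """Type of the curve A x^2 + 2B xy + C y^2 + ... = 0 by the sign of AC - B^2."""
    d = A * C - B * B
    if d > 0:
        return "elliptic"
    if d < 0:
        return "hyperbolic"
    return "parabolic"


def rotation_angle(A, B, C):
    """An angle alpha satisfying tan(2*alpha) = 2B / (A - C), see (32)."""
    return 0.5 * math.atan2(2.0 * B, A - C)


def rotated_coefficients(A, B, C, alpha):
    """Coefficients (A', B', C') of the second-degree terms after turning the axes."""
    c, s = math.cos(alpha), math.sin(alpha)
    A1 = A * c * c + 2.0 * B * c * s + C * s * s
    B1 = B * (c * c - s * s) + (C - A) * c * s   # half the coefficient of x'y'
    C1 = A * s * s - 2.0 * B * c * s + C * c * c
    return A1, B1, C1


A, B, C = 5.0, 2.0, 2.0          # an assumed example, 5x^2 + 4xy + 2y^2 + ...
alpha = rotation_angle(A, B, C)
A1, B1, C1 = rotated_coefficients(A, B, C, alpha)
print(classify_second_order(A, B, C), alpha)
print(A1, B1, C1, A1 * C1 - B1 * B1, A * C - B * B)
```

After the rotation B' should vanish up to rounding errors, and the last two printed numbers should coincide, in accordance with the invariance of $AC - B^2$.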

As an example, let us again consider the graph of the function
$y = \frac{k}{x}$ which expresses inverse variation (see Sec. 1.25). The
equation can be rewritten as xy − k = 0. Comparing with equation (30)
we see that in this case A = C = D = E = 0, $B = \frac{1}{2}$ and F = −k.
Since $AC - B^2 = -\frac{1}{4} < 0$ we have a hyperbolic case. Formula (32)
now implies $\tan 2\alpha = \frac{1}{0} = \pm\infty$ and this means that we can put
$2\alpha = \frac{\pi}{2}$ or $\alpha = \frac{\pi}{4}$. Let us rotate the axes through the angle of 45°.
According to formulas (4) we have

$$x = x'\frac{\sqrt{2}}{2} - y'\frac{\sqrt{2}}{2} = \frac{\sqrt{2}}{2}(x' - y'), \quad y = x'\frac{\sqrt{2}}{2} + y'\frac{\sqrt{2}}{2} = \frac{\sqrt{2}}{2}(x' + y')$$

Substituting the expressions into the original equation we obtain:

$$\frac{\sqrt{2}}{2}(x' - y') \cdot \frac{\sqrt{2}}{2}(x' + y') - k = 0$$

i.e.

$$\frac{x'^2}{2k} - \frac{y'^2}{2k} = 1$$

Consequently, the curve in question is a hyperbola which has equal
semi-axes: $a = b = \sqrt{2k}$ (see Fig. 91). The fact we have established
here accounts for the usage of the term “hyperbola” in Sec. 1.25.


CHAPTER III 


Limit. Continuity 


§ 1. Infinitesimal and Infinitely Large Variables 


1. Infinitesimal Variables. Infinitesimal variables form a very 
important class of variables which is of great significance in higher 
mathematics. A variable changing in a certain process is called infi- 
nitesimal if in this process it approaches (tends to) zero unlimitedly. 
For instance, let us consider the process of expansion of a given 
mass of gas. If the gas expands unlimitedly then its density and 
pressure are infinitesimals; this is an example of positive, conti- 
nuous and monotonic (see Sec. 1.5) infinitesimal variables. In the 
process of damped oscillations of a pendulum its angle of deviation 
from the equilibrium position also represents an infinitesimal variable 
as time increases but this variable is an oscillating one and assumes 
both positive and negative values (and the zero value as well). If
we take the sequence $a_1 = -1$, $a_2 = -\frac{1}{2}$, $a_3 = -\frac{1}{3}$, ... then
its general term $a_n = -\frac{1}{n}$ is a discrete and negative infinitesimal
variable in the process in which the number n increases:
n = 1, 2, 3, .... It should be noted that when a variable is qualified
as an infinitesimal one we must point out a certain process in
which the variable changes since the same variable may not be 
an infinitesimal at all in some other process. 

As it has been stated an infinitesimal variable $\alpha$ “approaches
zero unlimitedly”. Let us discuss this term in detail. Suppose a
variable $\alpha$ changes in some process and tends to zero. Then there
must exist a moment from which on we shall necessarily have
$|\alpha| < 1$; similarly, there must exist some other (later) moment from
which on $|\alpha| < 0.1$. By the same reasoning, there must exist
the third moment (following the first two moments) from which
on we shall always have $|\alpha| < 0.01$ and so on. This can be expressed
in the following manner: for any $\varepsilon > 0$ there must exist a certain
moment in the development of the process from which on there
will always be $|\alpha| < \varepsilon$. It is not necessary to indicate such a moment
explicitly in all cases: we should only be sure of the very existence
of the moment. Hence, an infinitesimal variable may not be
small at all at the beginning of the process in which it changes and 
the essential fact is that it becomes arbitrarily small (in its absolute 
value) when the process goes on sufficiently long. 

In addition, let us now state more precisely what the expression 
“a moment” in the development of a process means. If we consider 
a process developing in time then “a moment” simply means a 
certain moment of time. The realization of a process may not be 
connected with the time but may be related to some other variable
(for instance, in the third of the above examples the process is
characterized by the change of the number n which assumes the
successive values 1, 2, 3, . . .); in such cases the existence of “a moment”
means that a variable which characterizes the process takes on 
a certain value. 

Taking into account the specifications given in the last two paragraphs
we can say that the general term $a_n$ of a sequence is an
infinitesimal variable in the process which is characterized by the
increase of the number n if for any arbitrarily chosen $\varepsilon > 0$ it is
possible to indicate a number $N = N(\varepsilon)$ such that there will be
$|a_n| < \varepsilon$ as n becomes larger than N (n > N).
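
For a concrete sequence the number N can be indicated explicitly. The Python sketch below is an illustration for the sequence with the general term −1/n considered above; the function name is hypothetical. Since the absolute value of the term is 1/n, it is sufficient to take N equal to the integral part of 1/ε.

```python
import math


def threshold_index(eps):
    """A number N such that |a_n| = 1/n < eps for all n > N (for a_n = -1/n)."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.floor(1.0 / eps)


for eps in (1.0, 0.1, 0.01, 0.003):
    N = threshold_index(eps)
    # Check the inequality for a finite range of indices following N.
    holds = all(abs(-1.0 / n) < eps for n in range(N + 1, N + 1000))
    print(f"eps = {eps}: N = {N}, |a_n| < eps for the checked n > N: {holds}")
```

The loop checks only finitely many indices, so it illustrates the definition but does not replace the argument: for n > N we have n > 1/ε and hence 1/n < ε.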

The notion of an infinitesimal variable can be specified in an 
analogous manner for other types of variables and processes but 
we shall not use them further. 

From the point of view of the definition a constant value, even 
a very small one, is not an infinitesimal variable; only the constant 
value equal to zero is an infinitesimal variable from the formal
point of view of the above definition. 

Now we must point out that the definition of an infinitesimal 
variable which we shall use here involves the following principal 
difficulty: there are no real variables which may approach zero 
unlimitedly. Indeed, in the examples considered above the gas 
cannot expand unlimitedly and the real pendulum stops oscillating
after some time has passed. Furthermore, if we take into account
the molecular structure of a substance we see that we cannot take 
an infinitely small mass of the substance because it is impossible 
to consider the mass of the substance which is smaller than the mass 
of its molecule; similar situations are observed in other examples. 

Thus, our definition applies only to a mathematical model of 
a real process in which the real situation is simplified so that the 
application should become possible. Therefore we speak about a 
pendulum which has oscillations lasting infinitely long and about 
a “continuous” (non-molecular) structure of a substance and so forth. 
We always replace a real process by its mathematical model but 
this must be performed in such a way that the main features of the 
process we are interested in should not be considerably changed. 
But nevertheless we always deal with a model and we must not forget
it. Otherwise some principal mistakes may occur. For example, one 
can make an attempt to attribute all the properties of a model to 
the reality without sufficient reasons. 

There is another possibility of interpreting the usage of the notion 
of an infinitesimal in practical applications. Let us discuss it here. 
The practical (“physical”) infinity should be distinguished from 
the mathematical concept of infinity. Thus, a “practical” infinite- 
simal is a variable or even a constant which is sufficiently small in 
comparison with “finite” values which are involved in a certain 
investigation, i.e. so small that it should be possible to apply to 
it all the properties of “mathematical” infinitesimals without any 
considerable error. At the same time this value must not be too 
small, that is so small that it should be necessary to take into ac- 
count some effects of the microstructure when it is inexpedient, 
or so small that it should not comply with the real possible values. 
For example, suppose we are studying the deformations of an elastic 
body; then certain sizes which are sufficiently small in comparison 
with the size of the body and, at the same time, sufficiently large 
in comparison with the molecular sizes may be regarded as infi- 
nitesimals etc. 

In what follows we shall use the definition given in the beginning 
of this section but from time to time we shall come back to the 
considerations discussed here. 

2. Properties of Infinitesimals. The properties of infinitesimals 
are directly implied by the definition given in Sec. 1. 

1. The sum or the difference of two infinitesimals is also an infi- 
nitesimal variable. Indeed, if each summand approaches zero then 
the sum does the same. Similarly, the sum of three, ten and, in 
general, of an arbitrary finite number of infinitesimals is also an 
infinitesimal. We point out here that there are some circumstances 
when in the development of a process the number of summands ente- 
ring into the sum increases infinitely; then even if each summand 
is an infinitesimal variable the whole sum may not be infinitesimal. 
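For instance, 

$$\underbrace{\frac{1}{n^2} + \frac{1}{n^2} + \dots + \frac{1}{n^2}}_{n\ \text{times}} = \frac{1}{n}, \qquad
\underbrace{\frac{1}{n} + \frac{1}{n} + \dots + \frac{1}{n}}_{n\ \text{times}} = 1, \qquad
\underbrace{\frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n}} + \dots + \frac{1}{\sqrt{n}}}_{n\ \text{times}} = \sqrt{n}$$

Here we have the situation described above when n increases; but 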
the first sum is an infinitesimal variable, the second sum is constant 
and the third sum even increases unlimitedly. 
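
The effect of a growing number of summands can be checked numerically. 
The sketch below assumes, purely for illustration, the summands 1/n^2, 
1/n and 1/sqrt(n), each taken n times; the function name 
sum_of_n_equal_terms is hypothetical. 

```python
import math

def sum_of_n_equal_terms(term, n):
    """Add the same summand term(n) exactly n times."""
    return sum(term(n) for _ in range(n))

# Assumed illustrative summands; each of them tends to zero as n grows.
summands = {
    "1/n^2": lambda n: 1.0 / n**2,
    "1/n": lambda n: 1.0 / n,
    "1/sqrt(n)": lambda n: 1.0 / math.sqrt(n),
}

for n in (10, 100, 1000, 10000):
    row = {name: sum_of_n_equal_terms(t, n) for name, t in summands.items()}
    print(n, row)
```

Under these assumptions the first column shrinks, the second stays near 1 
and the third keeps growing, although every single summand becomes small. 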

2. The product of an infinitesimal variable by a bounded variable 
(see Sec. 1.5) is again an infinitesimal variable. Let, for example, 
the first factor be always in the limits from 0 to 1000 and let the 
second factor assume the successive values 1, 0.1, 0.01, 0.001 and 
so on. Then the values of the product will be, respectively, smaller 
than 1000 x 1 = 1000, 100, 10, 1, 0.1, 0.01, 0.001 and so forth. 

In particular, this property implies that the product of an infi- 
nitesimal by a constant is an infinitesimal variable. The product of 
two infinitesimals is an infinitesimal since an infinitesimal variable 
is, of course, a special case of a bounded variable. Similarly, the 
product of any arbitrary number of infinitesimals is an infinitesimal 
variable. 

We remark that the ratio of two infinitesimals may not be an 
infinitesimal. If, for example, $\alpha = \frac{1}{n}$, 
$\beta = \frac{1}{n^2}$ and $\gamma = \frac{1}{n} + \frac{1}{n^2}$ where 
n takes successive values 1, 2, 3, ... then the variables α, β and 
γ are infinitesimals. But at the same time the first of the ratios 
$\frac{\beta}{\alpha} = \frac{1}{n}$, $\frac{\gamma}{\alpha} = 1 + \frac{1}{n}$ 
and $\frac{\alpha}{\beta} = n$ is an infinitesimal while the 
second approaches 1 and the third even increases unlimitedly. We 
shall discuss such ratios in detail in § 3. 
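
A short exact computation makes the three behaviours visible. The sketch 
assumes the infinitesimals 1/n, 1/n^2 and 1/n + 1/n^2 as the example; 
the function name ratios is hypothetical. 

```python
from fractions import Fraction

def ratios(n):
    """Return (beta/alpha, gamma/alpha, alpha/beta) for the assumed example."""
    alpha = Fraction(1, n)
    beta = Fraction(1, n**2)
    gamma = alpha + beta
    return beta / alpha, gamma / alpha, alpha / beta

for n in (1, 10, 100, 1000):
    print(n, [float(r) for r in ratios(n)])
```

The first ratio tends to zero, the second approaches 1 and the third 
grows without bound, although all three variables tend to zero. 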

3. Infinitely Large Variables. A variable x is called infinitely large 
in some process of changing if its absolute value increases unlimitedly 
in this process; then we write $|x| \to \infty$. An infinitely large 
variable may be positive and then we write $x \to +\infty$ or negative 
(then we write $x \to -\infty$) but it may also change its sign: for 
instance, the variable $x_n = (-2)^n$ assumes the values -2, 4, -8, 
16, ... as the number n increases and therefore it is infinitely 
large but we cannot say that $x_n \to +\infty$ or $x_n \to -\infty$. The 
precise statement of the notion “increases infinitely” is analogous 
to the one given in Sec. 1 for the notion “infinitely approaches 
zero” but, of course, here we should consider inequalities of the 
form $|x| > N$. This means that from a certain moment on the 
variable must satisfy the inequality $|x| > 1$ and from some other 
(later) moment on it must satisfy the inequality $|x| > 10$. Further, 
there must exist a certain moment from which on we shall have 
$|x| > 100$ and so on. 

Now let us discuss some simple properties of infinitely large 
variables. A variable which is the inverse of an infinitely large variable 
is an infinitesimal and, conversely, the inverse of an infinitesimal vari- 
able is an infinitely large variable. These properties can be condi- 
tionally expressed as 


+=0 and 


1 


oe 


LIMIT. CONTINUITY 113 


We shall use this notation but one must understand it correctly. 
For example, the first of the properties means that if the variable 
x entering into the equality $\frac{1}{x} = \alpha$ increases 
unlimitedly then in the same process the variable α approaches zero 
(or, as in Sec. 1, if x is a “practical” infinitely large variable 
then α is “practically” infinitesimal). All formulas containing the 
symbol of infinity ∞ should be understood in a similar way. For 
instance, the formula $\tan \frac{\pi}{2} = \pm\infty$ is a conditional 
and abbreviated form of expressing the fact that when the variable φ 
approaches $\frac{\pi}{2}$ in some process then the variable 
x = tan φ increases unlimitedly in its absolute value, that is x is 
infinitely large and so on. This enables us to operate on the symbol 
∞ as if it were a usual number in many cases but, of course, ∞ is 
not a concrete number but only a symbol indicating infinitely large 
variables which are different in different circumstances. 

The sum of an infinitely large variable and a bounded variable is 
infinitely large since the first summand “gains over”. The sum of two 
infinitely large variables of the same sign is also infinitely large. 
In contradistinction to it the sum of two infinitely large variables 
of opposite signs may not be infinitely large since these infinitely 
large variables may “compensate” each other. These facts are written 
as $\infty + \infty = \infty$; the expression $\infty - \infty$ denotes 
an indeterminate form. This shows that it is impossible to operate on 
the symbol ∞ as on a usual number in all cases; it is not always true 
that $\infty - \infty = 0$ since $\infty - \infty$ is an abbreviated 
and conditional way of denoting a difference of the form X - Y where 
X and Y are infinitely large variables. The behaviour of these 
infinities may vary in different cases and therefore it is impossible 
to judge the behaviour of their difference unless an additional 
investigation has been carried out. We shall discuss indeterminate 
forms of various types in detail in our course later on. 
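
To see why a difference of two infinitely large variables needs a 
separate investigation, one may tabulate a few such differences. The 
three pairs X, Y below are assumptions chosen for illustration, and the 
function name differences is hypothetical. 

```python
def differences(n):
    """Three differences X - Y in which both X and Y grow unlimitedly with n."""
    return {
        "(n + 1/n) - n": (n + 1.0 / n) - n,  # tends to zero
        "(n + 5) - n": (n + 5) - n,          # stays equal to 5
        "n**2 - n": n**2 - n,                # is itself infinitely large
    }

for n in (10, 100, 1000):
    print(n, differences(n))
```

In each pair both terms increase unlimitedly, yet the differences behave 
in three different ways. 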

The product of two infinitely large variables is an infinitely 
large variable. Moreover, the product of an infinitely large variable 
by a variable which is larger than a positive constant in its absolute 
value is an infinitely large variable. At the same time, the ratio 
of two infinitely large quantities is an indeterminate form like the 
ratio of two infinitesimals. 


§ 2. Limits 


4. Definition. It is said that a variable x approaches (tends to) a 
finite limit a in some process if a is constant and x approaches a 
unlimitedly in this process. Then we write 

$$x \to a \quad \text{or} \quad \lim x = a$$

Thus, the limit of a variable in case it exists is a constant value. 

According to the above definition infinitesimals are variables 
that approach zero, that is having zero as a limit. But infinitely 
large variables, of course, have no finite limit. 

To say “x approaches a unlimitedly” is to say “the difference 
between x and a approaches zero unlimitedly”, that is x - a = α 
is an infinitesimal. The last equality may be rewritten as 

$$x = a + \alpha$$

where α is the infinitesimal. 

If a variable x approaches its limit a but always remains smaller 
than a, that is approaches a from the region of smaller values, then 
we conditionally write $x \to a - 0$ or $\lim x = a - 0$ (this is a 
conditional way of denoting the limit since if we understand the 
expression a - 0 as a real difference then a - 0 = a). If x in its 
process of approaching a always remains larger than a then we write 
$x \to a + 0$. Finally, x may tend to a in such a way that it takes 
on values larger than a and values smaller than a all the time 
(such a process resembles damped oscillations). All the cases described 
here are depicted in Fig. 92. 

Fig. 92 

Fig. 93 

Now we can sum up our discussion on the types of variables. A variable x 
may be of one of the following types in a certain process: 

(1) x is bounded and has a limit; in the special case when the limit 
is equal to zero x is an infinitesimal variable. To distinguish between 
these cases a bounded variable is sometimes called finite only if 
it is not an infinitesimal; for instance, it is possible to speak about 
an infinitesimal mass and about a finite mass etc. 

(2) x is bounded but has no limit; as an example we may consider 
the deviation of a pendulum from its equilibrium position in the 
case of undamped oscillations. Variables of this type are called 
oscillating (see Fig. 93). 

In Fig. 93 we see the point α which possesses the following property: 
the variable x approaches the point α infinitely many times 
in the process of its change and takes on values that are arbitrarily 
close to α but at the same time x does not remain near α 
all the time. In this case α is called a limit point of the variable x. 
There are the greatest value g and the least value p among these 
limit points. g and p are denoted, respectively, as $\overline{\lim}\, x$ 
and $\underline{\lim}\, x$ and are called the limit superior and the 
limit inferior of the variable x. But in this case x does not have a 
“unique” limit which we discussed in the beginning of this section. 
Therefore we should remark that the everyday notion of a “limit” (in 
the sense of some kind of a “border”, “frontier”) differs from the 
mathematical one. 

A bounded variable x always has the limit superior and the limit 
inferior and $\underline{\lim}\, x \le \overline{\lim}\, x$. The “unique” 
limit, that is the limit in the sense of our previous definition, 
exists if and only if $\underline{\lim}\, x = \overline{\lim}\, x$. 
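
The limit superior and the limit inferior can be approximated 
numerically by looking at the greatest and the least values on later 
and later stretches of the process. This tail-based recipe is a standard 
characterisation that we assume here, not one stated above; the 
variable x_n = (-1)^n (1 + 1/n) and the function name tail_bounds are 
also assumptions made for illustration. 

```python
def tail_bounds(x, start, length=10000):
    """Greatest and least value of x(n) for start <= n < start + length."""
    values = [x(n) for n in range(start, start + length)]
    return max(values), min(values)

def x(n):
    # Assumed example: bounded and oscillating, with limit points +1 and -1.
    return (-1) ** n * (1.0 + 1.0 / n)

for start in (1, 10, 100, 1000):
    print(start, tail_bounds(x, start))
```

As the starting number grows, the two bounds settle near 1 and -1, which 
are the limit superior and the limit inferior of this variable; since 
they differ, the variable has no limit. 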

(3) x is unbounded and besides infinitely large. In this case we 
write $\lim x = \pm\infty$, and x is said to have an infinite limit. 


Fig. 94 


(4) x is unbounded but not infinitely large. The deviation of an 
oscillating body in the case of a resonance may serve as an example 
here. Such a variable oscillates and from time to time travels 
“toward infinity” further and further but at the same time it keeps 
returning to regions lying near the initial point (see Fig. 94). 

5. Properties of Limits. 

1. If a variable has a limit then this limit is unique, i.e. there 
exist no other limits of the variable (the property is obviously im- 
plied by the definition). 

2. The limit of a constant equals the constant (the property is 
obvious). 

3. If $x \to a$ and $y \to b$ in one and the same process then 
$x + y \to a + b$. This can be written in a different manner as 

$$\lim (x + y) = \lim x + \lim y \tag{1}$$

and formulated as follows: the limit of a sum is equal to the sum of 
the limits. To prove this let us write x = a + α and y = b + β 
where α and β are infinitesimals. Then x + y = (a + b) + (α + β). 
Hence, the variable x + y is represented as a sum of the constant 
a + b and the infinitesimal α + β (see Sec. 2). Therefore, 
$(x + y) \to (a + b)$. 

The result thus obtained may be interpreted “practically” as in the 
following example: if 3.002 ≈ 3 and 2.001 ≈ 2 then 5.003 ≈ 5. 

4. The limit of a product equals the product of the limits. A more 
complete statement is the following: if the factors entering in a 
product have limits then the whole product also has a limit which is 
equal to the product of the limits of the factors. Indeed, using our 
previous notation we have 
$xy = ab + (a\beta + b\alpha + \alpha\beta) \to ab$, that is 

$$\lim (xy) = \lim x \, \lim y \tag{2}$$

Here, as in property 3, we have taken only two variables but 
it is easy to verify that these properties remain true for any 
arbitrary finite and constant number of variables. For example, 

$$\lim (xyz) = \lim [(xy) z] = \lim (xy) \, \lim z = \lim x \, \lim y \, \lim z$$

5. A constant factor may be taken outside the sign of the limit, that 
is lim (Cx) = C lim x where C = const. This property follows from 
properties 2 and 4. 

6. The limit of a ratio is equal to the ratio of the limits, i.e. 

$$\lim \frac{x}{y} = \frac{\lim x}{\lim y} \tag{3}$$

with the exception of those cases when both the numerator and the 
denominator tend to zero, that is when we have the indeterminate 
form $\frac{0}{0}$. 

To prove this we first suppose that $\lim y = b \ne 0$. Then 

$$\frac{x}{y} = \frac{a + \alpha}{b + \beta} = \frac{a}{b} + \left(\frac{a + \alpha}{b + \beta} - \frac{a}{b}\right) = \frac{a}{b} + \frac{\alpha b - \beta a}{b (b + \beta)}$$

The numerator of the last fraction is infinitesimal whereas the 
denominator tends to $b^2 = \text{const} \ne 0$ and therefore the whole 
fraction is infinitesimal while the first fraction is constant. This 
implies our assertion. 

In case $\lim y = 0$ and $\lim x \ne 0$ we have $\frac{x}{y} \to \pm\infty$ 
(see Sec. 3). Therefore we obtain $\pm\infty$ on either side of formula (3). 
By Sec. 3, formulas (1), (2) and (3) hold not only for finite limits 
but also for infinite ones with the exception of those cases when 
there are indeterminate forms of types $\infty - \infty$, $0 \cdot \infty$ 
and $\frac{\infty}{\infty}$ on the right-hand sides. These forms will be 
discussed in Secs. III.3 and IV.4. 

7. If $x \to a$ and a > 0 then x becomes and remains greater than 
zero (x > 0) when the process of its change has lasted sufficiently 
long (i.e. from some moment on). This obviously follows from the 
definition of a limit. 

8. It is permissible to pass to the limit in an inequality: if $x \le y$ 
then $\lim x \le \lim y$ (naturally, if these limits exist). Indeed, let 
us denote z = y - x. Then $z \ge 0$ and therefore z cannot approach 
a constant which is negative. Hence, $\lim z \ge 0$, $\lim (y - x) \ge 0$ 
and $\lim y - \lim x \ge 0$. 

We remark here that if x < y then after the passage to the limit 
we can obtain either lim x < lim y or lim x = lim y because the 
difference between x and y may tend to zero. Thus, we cannot retain 
the strict inequality unless an additional investigation has been 
carried out. 

9. If $x \le y \le z$ and we have $x \to a$ and $z \to a$ in one and the 
same process then $y \to a$ (see Fig. 95). 

Fig. 95   Fig. 96 

10. If a variable x increases monotonically then it either increases 
unlimitedly, i.e. $x \to +\infty$, or is bounded and then has a finite 
limit: $x \to a - 0 < \infty$. If, in addition, x has an upper bound A 
(i.e. $x \le A$) then $\lim x = a \le A$. A monotonically decreasing 
variable behaves similarly. These obvious assertions (see Fig. 96) 
are indeed the expression of an essential property of the 
“completeness” of the totality of all real numbers. If we used only 
rational numbers all the preceding properties of limits would remain 
true with the exception of property 10 since rational values may 
lead to an irrational result when passing to the limit. The rigorous 
justification of property 10 may be found, for example, in [14]. 
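
The remark about rational values can be illustrated by a monotonic 
bounded variable all of whose values are rational while its limit is 
not. The recurrence below (the classical averaging rule for the square 
root of 2) is an example we assume for illustration; it is not taken 
from the text, and the function name decreasing_rationals is 
hypothetical. 

```python
from fractions import Fraction
import math

def decreasing_rationals(steps):
    """Rational values x_{k+1} = (x_k + 2/x_k)/2, starting from x_0 = 2."""
    x = Fraction(2)
    values = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        values.append(x)
    return values

values = decreasing_rationals(5)
for v in values:
    print(v, float(v))
print("sqrt(2) =", math.sqrt(2))
```

Every value is an exact fraction, the values decrease and remain greater 
than zero, so by property 10 they have a limit; that limit, the square 
root of 2, is not a rational number. 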

Thus, a bounded monotonic variable must have a finite limit; a 
bounded but non-monotonic variable may have no limit (see Sec. 4). 

To conclude the section we point out that the dimension of a 
variable quantity remains the same when we pass to the limit: 
if $x \to a$ then [x] = [a]. 

The first attempt to create the theory of limits was made by 
Newton in 1686 but in fact the operation of passing to the limit 
had been used earlier beginning with Greek scientists. The notion 
of a limit which is close to the one used in this book was formulated 
in 1765 by J. D'Alembert (1717-1783), a French mathematician, 
philosopher and enlightener of the pre-revolutionary period 
in France. 

6. Sum of a Numerical Series. The idea of a limit is directly 
applicable to an important notion of a sum of a series. As a preli- 
minary let us introduce the following abbreviated notation: 


$$a_p + a_{p+1} + a_{p+2} + \dots + a_{q-1} + a_q = \sum_{k=p}^{q} a_k \tag{4}$$

Here $\sum_{k=p}^{q}$ is the summation sign which indicates that we should 
substitute k = p, p + 1, ..., q into the expression following 
the sign and then sum up the results (Σ is the Greek letter sigma), 
$a_k$ is the general term, or general element, or kth term of the sum 
(of the series), k is the number of the term (the index of summation), 
p and q are, respectively, the lower and the upper limits of summation 
showing the range of the index k. For example, 

$$\sum_{k=3}^{8} \frac{1}{k^2} = \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{6^2} + \frac{1}{7^2} + \frac{1}{8^2} \quad (= 0.2774)$$

We should point out immediately that the sum does not depend 
on the notation of the index of summation, that is 

$$\sum_{k=p}^{q} a_k = \sum_{i=p}^{q} a_i = \sum_{j=p}^{q} a_j$$

Indeed, all these sums are equal to the left-hand side of (4). Thus, 
the summation index is a dummy index, that is it does not enter in 
the result and may be denoted by any letter. 
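
The summation sign corresponds directly to a loop over the index. As a 
sketch, we assume the sum of 1/k^2 for k from 3 to 8, which reproduces 
the value 0.2774 quoted above; the helper name finite_sum is 
hypothetical. 

```python
def finite_sum(term, p, q):
    """Sum term(k) for k = p, p + 1, ..., q (both limits included)."""
    return sum(term(k) for k in range(p, q + 1))

s_k = finite_sum(lambda k: 1.0 / k**2, 3, 8)
s_j = finite_sum(lambda j: 1.0 / j**2, 3, 8)  # same sum, index renamed

print(round(s_k, 4), s_k == s_j)
```

Renaming the index from k to j changes nothing in the result, which is 
exactly what is meant by a dummy index. 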

Now we turn to “infinite sums” or, more precisely, to the notion 
of a numerical series. A numerical series is an infinite expression 
of the form 

$$a_1 + a_2 + \dots + a_n + \dots = \sum_{k=1}^{\infty} a_k \tag{5}$$

and the summands $a_1, a_2, a_3, \dots$ are certain numbers called the 
terms of the series. To define the sum of series (5) it is necessary 
first to compose the so-called “partial sums” of series (5): 

$$S_1 = a_1; \quad S_2 = a_1 + a_2; \quad S_3 = a_1 + a_2 + a_3; \quad \dots; \quad S_n = \sum_{k=1}^{n} a_k; \quad \dots$$

If the nth partial sum tends to a certain finite limit as the number 
n increases then series (5) is called convergent and its sum S is 
understood as 

$$S = \sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} S_n = \lim_{n \to \infty} \sum_{k=1}^{n} a_k$$

(The inscriptions in small type under the sign lim and, in other 
cases, under the sign → indicate the process in which the limits 
are considered.) Partial sums of a convergent series which have 
large numbers n are practically equal to each other and to the whole 
sum of the series. If there exists no finite limit of partial sums series 
(5) is called divergent. In particular, if partial sums approach 
infinity series (5) is said to diverge to infinity; in this case we write 

$$\sum_{k=1}^{\infty} a_k = \infty \quad (\text{or } -\infty)$$

A divergent series has no finite sum. 

The product of an infinite number of factors is determined in a 
similar way. The same manner of reasoning applies to any infinite 
process: first a finite process is performed and then the passage 
to the limit is carried out. 

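Let us consider, for example, the series 

$$1 + \frac{1}{3} + \frac{1}{3^2} + \dots + \frac{1}{3^{n-1}} + \dots \tag{6}$$

Using the formula for the sum of a geometrical progression we obtain 

$$S = \lim S_n = \lim \left(1 + \frac{1}{3} + \frac{1}{3^2} + \dots + \frac{1}{3^{n-1}}\right) = \lim \frac{1 - \frac{1}{3^n}}{1 - \frac{1}{3}} = \frac{1}{1 - \frac{1}{3}} = \frac{3}{2}$$

Thus, series (6) converges and its sum equals 1.5. If we calculate 
the partial sum of the first ten terms we receive approximately 
1.499975. 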
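
The partial sum of the first ten terms of a geometric series with 
ratio 1/3 can be verified directly. The sketch uses exact fractions so 
that the comparison with the closed formula for a geometrical 
progression is not blurred by rounding; the function name partial_sum 
is hypothetical. 

```python
from fractions import Fraction

def partial_sum(n, q=Fraction(1, 3)):
    """Sum of the first n terms 1 + q + q**2 + ... + q**(n - 1)."""
    return sum(q**k for k in range(n))

s10 = partial_sum(10)
closed_form = (1 - Fraction(1, 3) ** 10) / (1 - Fraction(1, 3))

print(float(s10), s10 == closed_form)
print("distance to 3/2:", float(Fraction(3, 2) - s10))
```

The difference between 3/2 and the tenth partial sum is already of the 
order of a few hundred-thousandths. 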


In like manner the series 

$$a + aq + aq^2 + \dots + aq^{n-1} + \dots \tag{7}$$

converges for |q| < 1 and its sum (the sum of an infinitely 
decreasing geometrical progression) is equal to $a (1 - q)^{-1}$. 

The nth partial sum $S_n$ of the series 

$$1 + 1 + 1 + \dots + 1 + \dots \tag{8}$$

equals n and therefore it tends to infinity. Hence, series (8) diverges 
to infinity. Similarly, $-1 - 1 - 1 - \dots - 1 - \dots = -\infty$. 

Partial sums of the series 

$$1 - 1 + 1 - \dots + (-1)^{n-1} + \dots \tag{9}$$

are equal, in succession, to $S_1 = 1$, $S_2 = 0$, $S_3 = 1$, $S_4 = 0$, ... 
and they have neither a finite nor an infinite limit but remain 
bounded and oscillate without damping (compare with the end of Sec. 4, 
Case 2). Thus, series (9) diverges in an “oscillating” manner. 

If we drop or add a term in series (5) this will not affect the very 
fact of its convergence or divergence, that is if series (5) converged 
before, it will converge now though its sum may change and if series 
(5) did not converge it will diverge after the operation. Indeed, if 
we take the series $a_2 + a_3 + \dots + a_n + \dots$ and consider it 
together with (5) then its partial sums will differ from the 
corresponding partial sums of (5) by the constant number $a_1$ and 
therefore if one of the sums approaches a limit the other does the 
same. Repeating such droppings or additions a finite number of times 
we come to the conclusion that an arbitrary change of a finite 
number of terms of series (5) does not affect its convergence or 
divergence. 

If series (5) converges then the series 
$R_n = a_{n+1} + a_{n+2} + a_{n+3} + \dots$ also converges (why is it 
so?). Its sum is called the remainder (remainder term, remainder after 
n terms, or “tail”) of series (5). It is clear that 

$$S = \sum_{k=1}^{\infty} a_k = \sum_{k=1}^{n} a_k + \sum_{k=n+1}^{\infty} a_k = S_n + R_n$$

It follows that the remainder of a convergent series tends to zero 
as the number n increases since it is the difference between the sum S 
of the series, which is the limit of the partial sums, and the nth 
partial sum $S_n$. 

We shall now establish the necessary condition (test) for the 
convergence of series (5). Since $S_{n-1} = a_1 + a_2 + \dots + a_{n-1}$ and 
$S_n = a_1 + a_2 + \dots + a_{n-1} + a_n$ we have $a_n = S_n - S_{n-1}$ and 
therefore 

$$a_n \to S - S = 0 \tag{10}$$

if series (5) converges. Thus, the general term $a_n$ of series (5) tends 
to zero as the number n increases. This condition is not at all 
sufficient for the convergence; for example, the condition is fulfilled 
for the series 

$$1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \dots + \frac{1}{\sqrt{n}} + \dots$$

but the series diverges to infinity. The divergence is implied by the 
fact that 

$$S_n = 1 + \frac{1}{\sqrt{2}} + \dots + \frac{1}{\sqrt{n}} \ge \underbrace{\frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n}} + \dots + \frac{1}{\sqrt{n}}}_{n\ \text{times}} = \frac{n}{\sqrt{n}} = \sqrt{n} \to \infty$$

We shall systematically study series of different types in Chapter 
XVII where, in particular, some rigorous sufficient conditions 
(tests) for their convergence will be formulated and proved. Yet 
we are going to make use of some series before that as the question 
on their convergence may be settled in some practical sense, although 
not rigorously enough, in the following way. We compute the terms 
one after another and when we see that soon enough, from some 
moment on, their values become less than the required degree of 
accuracy of calculations and that there is no reason to expect that 
the addition of the following terms may noticeably change the sum 
we simply drop all the subsequent terms. This means that we replace 
the series by a finite number of its terms (i.e. by a partial sum). 
In such a case we may say that the series is “practically conver- 
gent”. 
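
The practical rule just described can be sketched as a small procedure. 
The stopping test (drop everything from the first term smaller than a 
tolerance tol) is our simplified reading of that rule, and the names 
practical_sum, tol and max_terms are hypothetical. 

```python
def practical_sum(term, tol, max_terms=100000):
    """Add term(1), term(2), ... and stop at the first term smaller than tol.

    Returns the partial sum and the number of terms actually added.
    """
    total = 0.0
    for k in range(1, max_terms + 1):
        a = term(k)
        if abs(a) < tol:
            return total, k - 1
        total += a
    raise RuntimeError("terms did not fall below tol")

# Series 1 + 1/3 + 1/9 + ...: the rule works, the result is close to 1.5.
print(practical_sum(lambda k: 3.0 ** (1 - k), 1e-6))

# Series 1 + 1/sqrt(2) + 1/sqrt(3) + ...: the terms also fall below tol,
# but the series diverges, so the returned number has no meaning as a sum.
print(practical_sum(lambda k: k ** -0.5, 1e-2))
```

The second call shows why the rule requires, in addition to small terms, 
that there be no reason to expect a noticeable contribution from the 
dropped terms: smallness of the terms alone is only the necessary 
condition (10). 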


§ 3. Comparison of Variables 


7. Comparison of Infinitesimals. Comparison of two infinitesimals 
with each other is carried out by investigating their ratio. Let 
$\alpha \ne 0$ and $\beta \ne 0$ be two infinitesimals varying in one and 
the same process. Then we can have the following cases. 

(1) If $\frac{\beta}{\alpha} \to 0$ then β is said to tend to zero faster 
than α or that β is an infinitesimal of higher order than α and α is an 
infinitesimal of lower order than β. This fact is written in the form 
$|\beta| \ll |\alpha|$ or $|\alpha| \gg |\beta|$; the symbolic equality 
$\beta = o(\alpha)$ is also used for this purpose. Hence, in this case β 
is not only an infinitesimal but is also an infinitesimal part of the 
other infinitesimal α. For example, let ω be the volume of an 
infinitesimal cube and σ the volume of a right prism with the same base 
and with the constant altitude a. Then $|\omega| \ll |\sigma|$ since if h 
denotes the edge of the cube then 

$$\omega = h^3, \quad \sigma = a h^2 \quad \text{and} \quad \frac{\omega}{\sigma} = \frac{h^3}{a h^2} = \frac{h}{a} \to 0 \quad \text{as } h \to 0$$

(2) If $\frac{\beta}{\alpha} \to \infty$ then $\frac{\alpha}{\beta} \to 0$, 
i.e. $|\alpha| \ll |\beta|$. 

(3) If the ratio $\frac{\beta}{\alpha}$ approaches a finite nonzero limit 
then α and β are called infinitesimals of the same order; in this case 
neither of the variables α and β can become much smaller than the other. 
In particular, in case this limit is equal to unity the variables α and 
β are called equivalent infinitesimals; then we write $\alpha \sim \beta$. 
For example, the infinitesimals x and $x + x^2$ are equivalent as 
$x \to 0$ whereas the infinitesimal variables 2x and $x + x^2$ are of the 
same order in this process but not equivalent since 
$\frac{2x}{x + x^2} \to 2$. 

It is possible to verify the following properties. 

(1) If α and β are of the same order and $|\gamma| \ll |\alpha|$ then 
$|\gamma| \ll |\beta|$. 

(2) If α and β are of the same order and β and γ are also of the 
same order then the same is true for α and γ. 

(3) If α and β are of the same order and have the same sign then 
α + β is of the same order as α and β; in case α and β have opposite 
signs α + β may happen to be of higher order and so forth. 

For example, let us verify the first property: 

$$\lim \frac{\gamma}{\beta} = \lim \left(\frac{\gamma}{\alpha} \cdot \frac{\alpha}{\beta}\right) = \lim \frac{\gamma}{\alpha} \cdot \lim \frac{\alpha}{\beta} = 0 \cdot \lim \frac{\alpha}{\beta} = 0$$

which implies what was required. (Verify the remaining properties; 
by the way, they are quite evident.) 


(4) If $\frac{\beta}{\alpha} \to k \ne 0$ and we denote β - kα = γ, i.e. 
β = kα + γ, then $|\gamma| \ll |\alpha|$; in other words, infinitesimals 
of the same order are proportional to each other to within a term of 
higher order. This follows immediately from 

$$\frac{\gamma}{\alpha} = \frac{\beta - k\alpha}{\alpha} = \frac{\beta}{\alpha} - k \to k - k = 0$$

a a 


Such a case of “almost proportional” infinitesimals is sometimes 
denoted as aw f. 

8. Properties of Equivalent Infinitesimals. The following simple 
properties are true: 

(1) if α ~ β then β ~ α; 

(2) if α ~ β and β ~ γ then α ~ γ; 

(3) if α ~ β then α = β + γ where $|\gamma| \ll |\alpha|$ (and 
$|\gamma| \ll |\beta|$); in other words, the difference between 
equivalent infinitesimals is an infinitesimal variable of higher order. 
Conversely, if α = β + γ where $|\gamma| \ll |\beta|$ then α ~ β which 
means that the addition of a variable of higher order to an 
infinitesimal results in a variable equivalent to this infinitesimal; 

(4) if $\alpha \sim \alpha_1$ and $\beta \sim \beta_1$ then 
$\lim \frac{x\alpha}{y\beta} = \lim \frac{x\alpha_1}{y\beta_1}$ where x and y 
are arbitrary variables or numbers; this means that when calculating 
limits it is allowed to replace infinitesimals occurring in the 
numerator and in the denominator by equivalent variables. 

All these properties can be verified in a similar way. For example, 
let us justify the fourth property: 

$$\frac{x\alpha}{y\beta} = \frac{x\alpha_1}{y\beta_1} \cdot \frac{\alpha}{\alpha_1} \cdot \frac{\beta_1}{\beta}, \quad \text{which implies} \quad \lim \frac{x\alpha}{y\beta} = \lim \frac{x\alpha_1}{y\beta_1} \cdot \lim \frac{\alpha}{\alpha_1} \cdot \lim \frac{\beta_1}{\beta} = \lim \frac{x\alpha_1}{y\beta_1} \cdot 1 \cdot 1$$

and so the fourth property is true. 

9. Important Examples. 

1. The length of an infinitesimal arc is equivalent to that of its 
chord, that is $\frac{\overline{MN}}{\overset{\frown}{MN}} \to 1$ as 
$N \to M$ (see Fig. 97; here $\overline{MN}$ denotes the chord and 
$\overset{\frown}{MN}$ the arc). The explanation of the property lies in 
the fact that a small arc is so short that it has no “room” to “crook” 
noticeably, i.e. to change its direction. Therefore, if we watch these 
elements “through a microscope” so that they are magnified up to finite 
sizes the chord will be practically indistinguishable from the arc. In 
more rigorous investigations of this obvious property this fact is 
sometimes regarded as an axiom on which the definition of the length of 
an arc is based. On the other hand, this property is sometimes deduced 
from other analogous axioms. 

Fig. 97   Fig. 98   Fig. 99 

2. Applying the above result to an infinitesimal arc of a circle 
(see Fig. 98) we derive 

$$\frac{\overline{MN}}{\overset{\frown}{MN}} = \frac{2\,\overline{PN}}{2\,\overset{\frown}{QN}} = \frac{2R \sin x}{2Rx} = \frac{\sin x}{x} \to 1 \quad \text{as } x \to 0$$

It was assumed that x > 0 but the expression $\frac{\sin x}{x}$ does not 
change when x changes its sign and therefore the sign of x does not 
matter here. Thus 

$$\lim_{x \to 0} \frac{\sin x}{x} = 1 \tag{11}$$

Incidentally, we see that sin x < x for x > 0 (since the chord MN is 
shorter than the arc MN). 

3. Now let us consider Fig. 99 which repeats Fig. 44 with some 
additional lines. We see that $\tan \beta = \frac{\ln (1 + h)}{h}$. Now if 
$h \to 0$ then $\beta \to \alpha = 45°$ and $\tan \beta \to \tan \alpha = 1$, 
that is 

$$\lim_{h \to 0} \frac{\ln (1 + h)}{h} = 1 \tag{12}$$

In Fig. 99 it is assumed that h > 0 but the same is true for h < 0 
and $h \to 0$. 

Here is an important corollary of formula (12). Since 
$(1 + h)^{\frac{1}{h}} = e^{\frac{\ln (1 + h)}{h}}$, we have, as $h \to 0$, 

$$\frac{\ln (1 + h)}{h} \to 1 \quad \text{and} \quad e^{\frac{\ln (1 + h)}{h}} \to e^1 = e$$

Thus, 

$$\lim_{h \to 0} (1 + h)^{\frac{1}{h}} = e \tag{13}$$

The last limit is sometimes taken as the definition of the number e. 

Many other limits can be evaluated by means of the results repre- 
sented above. Here we also mention that we shall offer a more stan- 
dard and simple method of evaluating limits in § IV.4. 
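
The three limits of this section, numbered (11), (12) and (13), lend 
themselves to a quick numerical check. The sketch below evaluates 
sin(h)/h, ln(1 + h)/h and (1 + h)^(1/h) for small h of both signs; the 
function name checks is hypothetical. 

```python
import math

def checks(h):
    """Values of sin(h)/h, ln(1 + h)/h and (1 + h)**(1/h) for a small h."""
    return math.sin(h) / h, math.log1p(h) / h, (1.0 + h) ** (1.0 / h)

for h in (0.1, 0.01, 0.001, -0.001):
    print(h, checks(h))
print("e =", math.e)
```

As h decreases in absolute value the first two quantities approach 1 and 
the third approaches e. Such a table is, of course, an illustration and 
not a proof. 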

10. Orders of Smallness. Let α and β be two infinitesimals changing 
in one and the same process. If β is of the same order as $\alpha^k$ 
then β is said to be an infinitesimal of the kth order relative to α. 
Here the speed at which α approaches zero serves as a standard with 
which the speed of β (as β tends to zero) is compared. 

Examples. Let $x \to 0$, i.e. let x be an infinitesimal. We shall 
regard it as a standard. Then if $y = 2x^2$ we see that y is an 
infinitesimal of the second order (relative to x) since y and $x^2$ are 
infinitesimals of the same order; if $z = 4x^3 + x^7$ then z is an 
infinitesimal of the third order since z and $x^3$ are of the same order 
($\frac{z}{x^3} \to 4$). 


In general, the sum (or the difference) of infinitesimals of different 
orders is characterized by the lowest order of the infinitesimals. Namely, 
the infinitesimal which has the lowest order of smallness is the 
principal term in such a sum. In other words, all the remaining terms 
are negligibly small relative to the principal one and the sum is 
almost completely exhausted by the principal term when the process 
goes on sufficiently long. Furthermore, if $u = \sqrt{x} - x^2$ then u 
is of the order $\frac{1}{2}$ and hence u is an infinitesimal of lower 
order than x, i.e. $\frac{u}{x} \to \infty$ and $|u| \gg |x|$. Generally, 
if the order of an infinitesimal is lower than unity the infinitesimal 
is of lower order relative to the standard. Finally, let us take 
v = 1 - cos x; here v is of the second order since 

$$\lim_{x \to 0} \frac{v}{x^k} = \lim_{x \to 0} \frac{1 - \cos x}{x^k} = \lim_{x \to 0} \frac{2 \sin^2 \frac{x}{2}}{x^k} = \lim_{x \to 0} \frac{2 \sin \frac{x}{2} \sin \frac{x}{2}}{x^k} = \lim_{x \to 0} \frac{2 \cdot \frac{x}{2} \cdot \frac{x}{2}}{x^k}$$

(in the passage to the last term we have used the fourth property 
from Sec. 8) and therefore to receive a finite nonzero limit 
we should put k = 2. 

If the standard is changed the order of an infinitesimal may also 
change and therefore it is absolutely necessary to indicate the 
standard variable. For instance, the variable $y = x^6$ is an 
infinitesimal of the sixth order relative to x as $x \to 0$ but it is 
only of the second order relative to $x^3$. 
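
The order of smallness relative to the standard x can be estimated 
numerically: if an infinitesimal behaves like C x^k for small x then the 
ratio of its values at two small points determines k. The two-point 
formula, the sample points and the name estimated_order are assumptions 
of this sketch; the four expressions are our reading of the examples of 
this section. 

```python
import math

def estimated_order(beta, x1=1e-3, x2=1e-4):
    """Estimate k such that beta(x) behaves like C * x**k as x -> 0 (x > 0)."""
    return math.log(abs(beta(x1) / beta(x2))) / math.log(x1 / x2)

examples = {
    "2x^2": lambda x: 2 * x**2,
    "4x^3 + x^7": lambda x: 4 * x**3 + x**7,
    "sqrt(x) - x^2": lambda x: math.sqrt(x) - x**2,
    "1 - cos x": lambda x: 1 - math.cos(x),
}

for name, beta in examples.items():
    print(name, round(estimated_order(beta), 3))
```

In every case the estimate is governed by the principal term, that is by 
the term of the lowest order. 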

11. Comparison of Infinitely Large Variables. The comparison 
of infinitely large variables is carried out in a way similar to that 
of comparison of infinitesimals. But there exists a certain difference 
in the terminology: thus, if $\frac{x}{y} \to 0$ where x and y are 
infinitely large variables then we say that x is of lower order 
relative to y and y is of higher order relative to x but the notation 
$|x| \ll |y|$ or x = o(y) remains (these forms of writing are used in 
all cases when $\frac{x}{y} \to 0$ even if the variables x and y are 
neither infinitesimal nor infinitely large; we also remark that the 
notation x = O(y) indicates that the ratio $\frac{x}{y}$ is bounded). 
All the assertions of Sections 7, 8 and 10 are transferred with some 
little changes to infinitely large variables. 

To conclude the section we point out an obvious property: if 
lim x = 0, lim y = const ≠ 0 and lim |z| = ∞ then $|x| \ll |y|$ 
and $|y| \ll |z|$. 


§ 4. Continuous and Discontinuous Functions 


12. Definition of a Continuous Function. The definition of a 
continuous function was given in Sec. 1.16. Now we are going to 
discuss it in detail. 

Suppose a function y = f(x) is given. Let its argument first take 
on the value $x_0$ and then receive an increment Δx, that is let x assume 
a new value $x = x_0 + \Delta x$ (see Sec. 1.22 on this notation). Then the 
function will also receive an increment 

$$\Delta y = y - y_0 = f(x) - f(x_0) = f(x_0 + \Delta x) - f(x_0) \tag{14}$$

The function f is called continuous at the point $x_0$ (i.e. for the value 
of the argument equal to $x_0$) if $\Delta y \to 0$ in a process in which 
$\Delta x \to 0$ or, in other words, if the increment of the function is an 
infinitesimal when the increment of the argument is infinitesimal. 
Otherwise $x_0$ is called a point of discontinuity of the function f. 

Since $f(x) = f(x_0) + \Delta y$ [see formula (14)] the condition 
$\Delta y \to 0$ is equivalent to the following condition: 
$f(x) \to f(x_0)$. Further, writing $x \to x_0$ instead of $\Delta x \to 0$ 
we arrive at the following equivalent representations of the definition 
of continuity: 

$$f(x) \xrightarrow[x \to x_0]{} f(x_0) \quad \text{or} \quad \lim_{x \to x_0} f(x) = f(x_0)$$

and, finally, 

$$\lim_{x \to x_0} f(x) = f\left(\lim_{x \to x_0} x\right) \tag{15}$$

This means that the limit of a continuous function at a point equals 
the value which the function assumes when the argument takes on the 
value of its own limit.* 

It should be underlined that the value of a function at a point 
of its continuity cannot be infinite. 

A function which is continuous at each point of an interval is said 
to be continuous over the interval. 

13. Points of Discontinuity. If $x_0$ is a point of discontinuity of
a function $f$ then the value $f(x_0)$ of the function is often undefined,
but this fact usually does not play any important role. In these
circumstances the limits of the values of the function $f(x)$ as $x \to x_0 - 0$
and $x \to x_0 + 0$ are essential (see Sec. 4). These two
limits are denoted conventionally as $f(x_0 - 0)$ and $f(x_0 + 0)$,
respectively (see Fig. 100).**

It may occur that these limits are finite and $f(x_0 - 0) = f(x_0 + 0)$
though the value $f(x_0)$ may not be defined, or it is defined but does
not coincide with $f(x_0 \pm 0)$. Such a discontinuity is called removable
since if we put $f(x_0) = f(x_0 \pm 0)$ (this value may be called the
“true” value of the function $f(x)$ at the point $x = x_0$) then there will
no longer exist any discontinuity of $f(x)$. We shall give a simple
example: let the function $f(x)$ be defined by the formula


$$f(x) = \frac{\sin x}{x} \qquad (16)$$


* Recalling the specification pointed out in Sec. 1 we can now formulate the
definition of continuity at the point $x_0$ as follows: for any given $\varepsilon > 0$ there
must exist $\delta > 0$ such that $|x - x_0| < \delta$ implies $|f(x) - f(x_0)| < \varepsilon$.
This definition, given by A. L. Cauchy (1789-1857), a prominent French
mathematician, is fundamental in books meant for mathematicians.

Defining the continuity of $y = f(x)$ at $x = x_0$ as the condition $\Delta y \to 0$
for $\Delta x \to 0$ we mean that $\Delta y \to 0$ not only for some particular process in which
$\Delta x \to 0$ or for several such processes but for all possible processes of this kind.
Writing (15) we also mean that it holds for all processes in which $x \to x_0$.
Cauchy's definition takes these facts into account automatically since for every
process in which $x \to x_0$ there exists a moment from which on $|x - x_0| < \delta$.

** The values $f(x_0 - 0)$ and $f(x_0 + 0)$ are called, respectively, the limit of $f(x)$
on the left of the point $x = x_0$ (the left-hand limit) and the limit of $f(x)$ on the
right of the point $x = x_0$ (the right-hand limit).-Tr.

The function is continuous for all $x \ne 0$ but not defined at $x = 0$
since $x = 0$ cannot be substituted into formula (16) because this
yields the indeterminate form $\frac{0}{0}$. But if in addition to formula (16)
we put $f(0) = 1$ then, by formula (11), the new function $f(x)$ thus
obtained will be defined and continuous for all $x$ without exception.
Thus, we had a removable discontinuity at the point $x = 0$. From
the geometrical point of view this means that the curve $pp$ (see
Fig. 101) “lacked” one point, i.e. the point $M$. After the point has
been added, the curve becomes continuous.

Fig. 100    Fig. 101


If the values $f(x_0 - 0)$ and $f(x_0 + 0)$ are finite but $f(x_0 - 0) \ne f(x_0 + 0)$
then the function $f$ is said to have a point of discontinuity
of the first kind or, which is the same, a finite jump [the
term jump is applied to the value $\Delta = f(x_0 + 0) - f(x_0 - 0)$]
or a jump discontinuity (see Fig. 100). In case at least one of the
values $f(x_0 - 0)$ or $f(x_0 + 0)$ equals infinity we say that the function,
in a conventional sense, turns into infinity at the point $x_0$ (or
we say that the graph of the function “travels into infinity”). For
instance, the function $f(x) = \frac{1}{x}$ behaves in this way at the point
$x = 0$.

In conclusion we remark that in some cases $f(x_0 - 0)$ or $f(x_0 + 0)$
has neither a finite nor an infinite value since a variable may have
neither a finite nor an infinite limit. For instance, we have $\frac{1}{x} \to \infty$
as $x \to 0$ and therefore the function $f(x) = \sin \frac{1}{x}$ will
pass infinitely many times from $-1$ to $+1$ and back to $-1$ as $x \to 0$; thus
$f(x) = \sin \frac{1}{x}$ has neither a right-hand limit nor a left-hand one
(and, in general, it has no limit) at $x = 0$ (see Fig. 102).
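
The kinds of discontinuity described above can be told apart roughly by sampling the function on both sides of the point. The sketch below is a heuristic under our own assumptions: the helper names, the step sizes and the thresholds tol and big are hypothetical, and a finite sample can of course be fooled by a sufficiently awkward function.

```python
import math

def one_sided_values(f, x0, side, steps=(1e-2, 1e-4, 1e-6)):
    """Sample f at points approaching x0 from the left (side=-1) or right (side=+1)."""
    return [f(x0 + side * h) for h in steps]

def classify(f, x0, tol=1e-3, big=1e3):
    """Rough numerical guess at the behaviour of f near x0 (never evaluates f at x0)."""
    left = one_sided_values(f, x0, -1)
    right = one_sided_values(f, x0, +1)
    # Samples of very large magnitude suggest an infinite one-sided limit.
    if abs(left[-1]) > big or abs(right[-1]) > big:
        return "infinite"
    # Samples that do not settle down suggest that a one-sided limit is missing.
    if abs(left[-1] - left[-2]) > tol or abs(right[-1] - right[-2]) > tol:
        return "no one-sided limit"
    # Both one-sided limits look finite: compare them.
    if abs(left[-1] - right[-1]) > tol:
        return "finite jump"
    return "removable or continuous"

print(classify(lambda x: math.sin(x) / x, 0.0))         # removable or continuous
print(classify(lambda x: 0.0 if x < 0 else 1.0, 0.0))   # finite jump
print(classify(lambda x: 1.0 / x, 0.0))                 # infinite
print(classify(lambda x: math.sin(1.0 / x), 0.0))       # no one-sided limit
```

The last answer cannot distinguish a removable discontinuity from continuity, since that depends only on the value assigned at the point itself.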

Real variable quantities of physical nature have discontinuities
when a certain kind of action is suddenly applied or switched off,
when there is a transition from one medium into another (through
the interface between the two media), when the law of functional
relationship between the quantities suddenly changes, etc.

Fig. 102

See, for example, Fig. 103. It shows the electric current $i$ plotted
against time $t$ for the process of transmitting the letter “a” in the
Morse code by radio (the signal “dot-dash”), that is, the dependence
of $i$ on $t$. As is seen, we have here a function with four points of
discontinuity and at each of these points there is a finite jump cor- 
responding to switching on or switching off the constant emf (electro- 
motive force) in the circuit. It is important to pay attention to the 


Fig. 103    Fig. 104


fact that if we analysed this phenomenon more carefully (and for this
purpose took a much larger time scale for the $t$-axis) then we should
see that in reality the growth of the current is similar to the one shown 
in Fig. 104. Since there is always a certain inductance in the circuit 
the current increases continuously (although very fast) and therefore 
in real circumstances there must be no discontinuity of the function 
i (t) at all! In some cases when, for instance, a pulse lasts a very 
short time it may be important to take into account the continuity 
of this transient process (for example, corresponding to the part OA 
of the graph). But when the transient process is of no importance
it is simpler to schematize the process and consider the function
$i(t)$ discontinuous according to Fig. 103 if this does not lead to noticeable
mistakes. Thus, one and the same function $i(t)$ of a real
“physical” nature may be regarded as continuous or discontinuous
depending on whether we intend to take into account the transient
process or not. In the case of passage from one medium into another
an analogous role is played by the processes near the interface between
the media which may or may not be taken into account.
If an elementary function is considered then, as will be shown
in Sec. 14, it can have a discontinuity at $x = x_0$ if and only if the
substitution of $x = x_0$ into the function yields an expression
of type $\frac{a}{0}$, $\ln 0$ or $0^0$ in the expression of $f(x)$ or in some part
of this expression. G. H. Hardy (1877-1947), an English mathematician,
showed that when such a situation takes place the limits
$f(x_0 - 0)$ and $f(x_0 + 0)$, finite or infinite, should exist provided
the function $f$ is defined, respectively, on the left or on the right
of $x_0$. Only in case the expressions $\sin \infty$ and $\cos \infty$ enter into the
representation of $f(x_0)$ may there be an exception to this rule.

If a function $f$ is defined only on one side of $x_0$, for example, on
the right, then it may happen that only $f(x_0 + 0)$ (“the end-point
value”) exists while $f(x_0 - 0)$ does not. The limits $f(-\infty)$ and
$f(+\infty)$ may also be regarded as end-point values.

14. Properties of Continuous Functions. 

1. The sum of two continuous functions is a continuous function.
Indeed, if the functions $f_1(x)$ and $f_2(x)$ are continuous and
$f(x) = f_1(x) + f_2(x)$ then, as $x \to x_0$,

$$\lim f(x) = \lim [f_1(x) + f_2(x)] = \lim f_1(x) + \lim f_2(x) = f_1(x_0) + f_2(x_0) = f(x_0)$$

which implies the continuity of the function $f(x)$ (see Sec. 12).
We point out that in the above proof we first used a property of
limits (see Sec. 5) and then the continuity of the functions $f_1$ and $f_2$.
Similar application of other properties of limits implies the following:

a sum or a difference or a product of an arbitrary number of continuous
functions is also a continuous function;

a ratio of two continuous functions is a function which is continuous
everywhere except the points where the denominator equals zero. The
ratio either approaches infinity or becomes an indeterminate form
of type $\frac{0}{0}$ at those points where the denominator vanishes.
Therefore the continuity does not hold in either case.

2. A composite function formed by means of continuous functions
is a continuous function. Indeed, if the functions $z(y)$ and $y(x)$ are
continuous and $x$ assumes an infinitesimal increment then, by the
continuity of the second function, the increment of $y$ will be
infinitesimal too and therefore the increment of $z$ will also be
infinitesimal by the continuity of the first function; thus, the composite
function $z(x)$ will be continuous.

The first two properties enable us to make the following conclusion
concerning the continuity of elementary functions. Reviewing
the basic elementary functions (see Sec. I.18 and § I.4) we see that
among them only $y = x^n$ has a discontinuity for $n < 0$ at $x = 0$
(in this case the form $\frac{1}{0}$ appears), $y = \log_a x$ has a discontinuity
at $x = 0$ (this yields $\log 0$) and $y = \tan x$ at $x = \pm\frac{\pi}{2}, \pm\frac{3\pi}{2}, \ldots$
(this results in $\frac{\sin \frac{\pi}{2}}{\cos \frac{\pi}{2}} = \frac{1}{0}$). When composite functions and algebraic
combinations are composed of basic functions then, according
to properties 1 and 2 and Sec. 15 (1), new points of discontinuity
may appear if and only if expressions of the form $\frac{a}{0}$ and $0^0$
occur. This proves the assertion stated at the end of Sec. 13.
Hence, if $f(x)$ is an elementary function then $\lim_{x \to x_0} f(x)$ simply
equals $f(x_0)$ provided there are no “dangerous” expressions of the
form $\frac{a}{0}$, $\ln 0$ and $0^0$ in the expression of $f(x)$; for example,
$$\lim_{x \to 1} \frac{\ln(1 + \sin x)}{\cdots} = \frac{\ln(1 + \sin 1)}{\cdots} = -\ln(1 + \sin 1) = -\ln 1.8415 = -0.6105$$


(the values of $\sin 1$ and of $\ln 1.8415$ are taken from tables). This
rule of determining a limit remains true for the operations on
infinities provided that, after substituting the limit of the argument, the
indeterminate forms $\frac{0}{0}$, $\infty - \infty$, $\frac{\infty}{\infty}$, $0 \cdot \infty$ (which were mentioned
in Sec. 5), $0^0$, $1^\infty$, $\infty^0$ [which will be discussed in Sec. 15 (1)] and
expressions of the form $\sin \infty$ do not appear. For example,


h Ing __ In(+0) i _—o 
B ( = +cosc) ae -+ cos ( He spt t= — o, 
1 1 
lim (2+ 2" $= (0H (0) 0+ o0 =00 
x4 

For more on the indeterminate forms see § 3 and § IV.4.

3. As was indicated in Sec. I.16, the graph of a continuous function
$y = f(x)$ defined in an interval $a \le x \le b$ consists of one
component. The consideration of such a graph (see, for example, Fig. 105)
shows that a continuous function defined over a finite interval (including
its end-points) is bounded and attains its least (in the algebraic
sense) value (at $x = a$ in Fig. 105) and its greatest (at $x = c$) value.
These values are denoted, respectively, as $\min_{a \le x \le b} f(x)$ and $\max_{a \le x \le b} f(x)$
(abbreviations of “minimum” and “maximum”). Besides, such a function
takes on all the intermediate values between $f(a)$ and $f(b)$, each
value being taken at least once. For example, the value $y = q$
is assumed only once in Fig. 105, i.e. at $x = \delta$, whereas the value
$y = p$ is taken three times, at the points $x = \alpha$, $\beta$ and $\gamma$. Further,
if $f(x)$ is positive for some value $x = x_0$ it remains positive at all points
$x$ lying close enough to $x_0$. Finally, turning back to Figs. 25 and 26
we see that if a continuous function is monotonic then its inverse function
is also continuous (and monotonic).

Fig. 105    Fig. 106

Here we shall restrict ourselves to the visual considerations con- 
cerning these properties. Their rigorous proof does not appear simple 
in the general case; this proof can be found, for instance, in [14]. 
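
The intermediate-value property is what makes the method of halving the interval work in practice. The following sketch is our own illustration (the function name bisect_for_value and the sample function $x^3 - x$ are assumptions): it looks for a point where a continuous function takes a prescribed value lying between $f(a)$ and $f(b)$.

```python
def bisect_for_value(f, a, b, target, tol=1e-10):
    """Find x in [a, b] with f(x) close to target.

    Assumes f is continuous on [a, b] and target lies between f(a) and f(b).
    """
    fa, fb = f(a) - target, f(b) - target
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("target is not between f(a) and f(b)")
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m) - target
        if fm == 0:
            return m
        if fa * fm < 0:
            b = m            # the sign change lies in the left half
        else:
            a, fa = m, fm    # the sign change lies in the right half
    return (a + b) / 2

g = lambda x: x ** 3 - x
x_star = bisect_for_value(g, 0.0, 2.0, 3.0)   # g(0) = 0 < 3 < 6 = g(2)
```

The procedure returns one such point; as Fig. 105 shows, there may be several.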

It should be pointed out that a continuous function is not necessa- 
rily smooth, that is having a graph with a certain tangent at its 
every point (a smooth function is shown in Fig. 105). On the contrary, 
as it is shown in Fig. 106, it may happen to be piecewise smooth, 
that is to have a broken graph consisting of several smooth arcs. 
Even some more complicated cases are possible but we are not going 
to treat them in our course. Some more detailed comments on this 
question will be given in Sec. IV.3. 

15. Some Applications. 

1. Limits of Composite Exponential Expressions (“Power-Exponential”
Expressions). Let us consider the expression $x^y$ where $x \to a$,
$y \to b$ and $x > 0$. Let us represent $x^y$ in the form $x^y = (e^{\ln x})^y = e^{y \ln x}$.
But $\ln x \to \ln a$ by the continuity of the logarithmic
function and hence $y \ln x \to b \ln a$ (as a limit of a product).
Therefore $\exp(y \ln x) \to \exp(b \ln a)$ (by the continuity of the exponential
function). Thus, $x^y \to e^{b \ln a} = (e^{\ln a})^b = a^b$ or, in other words,

$$\lim x^y = (\lim x)^{\lim y}$$

and so we see that it is permissible to pass to the limit in the expression
$x^y$. The exception to this rule is the case when the product $b \ln a$
becomes an indeterminate form, i.e. it has the form $0 \cdot \infty$. This
may occur if

$\ln a = 0$, $b = \infty$, i.e. $a = 1$ and $a^b = 1^\infty$;
$\ln a = \infty$, $b = 0$, i.e. $a = \infty$ and $a^b = \infty^0$;
$\ln a = -\infty$, $b = 0$, i.e. $a = 0$ and $a^b = 0^0$.

Hence we have just the three types of indeterminate forms mentioned
in Sec. 14.
One may sometimes think that there must be $1^\infty = 1$ since “unity
to any power equals unity”. But $1^\infty$ is not at all unity to a certain
finite power but only the abbreviated notation for a limit of an expression
of the form $x^y$ where $x \to 1$ and $y \to \infty$. Suppose, for example,
that $x \to 1 + 0$, that is $x > 1$. Then the expression $x^y$ “has a certain
tendency to approach 1” (since $1^y = 1$) and at the same time it
“wants to tend to infinity” (since $x > 1$ and $x^\infty = \infty$ because if we
raise a constant number larger than unity to an infinitely increasing
power we shall arrive at an infinitely large variable). Therefore,
these two tendencies “act” upon the expression in opposite
directions and hence the result may be different in different problems
depending on which of these tendencies “wins”. For example,
in case (13) the limit turned out to be equal to $e$ whereas the immediate
substitution of $k = 0$ yields $1^\infty$. In this case we see that the
tendencies are equal in a certain sense; they are in a state of “balance”.
Similar conclusions may be derived in connection with other
indeterminate forms.
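
That the form $1^\infty$ may produce different results can be seen numerically. The sketch below is our own illustration: the three expressions $(1+k)^{1/k}$, $(1+k^2)^{1/k}$ and $(1+k)^{1/k^2}$ and the helper name power_samples are assumptions chosen for the purpose; each has a base tending to 1 and an exponent tending to infinity as $k \to +0$.

```python
import math

def power_samples(base, exponent, ks):
    """Evaluate base(k) ** exponent(k), written as exp(y ln x), for each k."""
    return [math.exp(exponent(k) * math.log(base(k))) for k in ks]

small = (1e-2, 1e-4, 1e-6)

# The two tendencies are "in balance": the values approach e.
to_e = power_samples(lambda k: 1 + k, lambda k: 1 / k, small)

# The tendency towards 1 "wins": the values approach 1.
to_one = power_samples(lambda k: 1 + k * k, lambda k: 1 / k, small)

# The tendency towards infinity "wins". Larger steps are used here
# so that the values stay within floating-point range.
to_inf = power_samples(lambda k: 1 + k, lambda k: 1 / k ** 2, (0.5, 0.1, 0.02))
```

The samples only indicate the trend; the limits themselves are established by the reasoning with $e^{y \ln x}$ given above.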

2. Solving Inequalities. Let a function f (x) be considered on an 
interval $(a, b)$ (in particular, the interval may be the whole $x$-axis)
and let it be necessary to solve the inequality

$$f(x) > 0 \qquad (17)$$

that is to determine all the values of $x$ for which it holds. In the
geometrical sense this means that we must find the regions on the
$x$-axis where the graph of the function $y = f(x)$ lies above the $x$-axis
(such regions are shaded in Fig. 107). But it must be stressed that
in this problem we regard the function $f(x)$ as known whereas its
graph may be unknown.


Fig. 107

To solve inequality (17) let us mark all the zeros of the function $f$
(that is the points where $f$ vanishes) on the interval $(a, b)$ and also
all the points of discontinuity (there are three zeros and one point
of discontinuity in Fig. 107). The interval is divided into several
parts by these points (five parts in Fig. 107). Since we have taken
into account all the points of discontinuity of the function it is
continuous inside each of these parts. Besides, the function does not
vanish inside the parts because we have reckoned all its zeros. Thus,
the function $y = f(x)$ retains its sign inside each of the parts (see


Fig. 108 


property 3 in Sec. 14). Now in order to determine this sign it is 
sufficient to determine the sign of the function at any point inside 
the considered part of the interval. After this operation we choose 
those parts in which the function is positive and thus the inequality 
(17) will be solved. 

Example. Let us solve the inequality $\frac{x^3 + 3x^2 - 4}{x^2 - 3} > 0$.
The numerator equals zero when $x = 1$ and therefore it is divisible
by $x - 1$. This implies that the expression

$$\frac{x^3 + 3x^2 - 4}{x^2 - 3} = \frac{(x - 1)(x^2 + 4x + 4)}{x^2 - 3} = \frac{(x - 1)(x + 2)^2}{x^2 - 3}$$

must be positive for the values of $x$ which we are interested in. Hence,
the function is defined over the whole $x$-axis except $x = \pm\sqrt{3}$ and
has two zeros ($x = 1$ and $x = -2$) and two points of discontinuity
($x = \pm\sqrt{3}$). These points break the $x$-axis into five parts (see
Fig. 108). Now we choose a point in each of the intervals, substitute
these values into the last fraction and determine the signs of the
fraction inside the intervals (the numerical values themselves do not
matter and only their signs are essential). Thus we receive the table

    Interval   $(-\infty, -2)$   $(-2, -\sqrt{3})$   $(-\sqrt{3}, 1)$   $(1, \sqrt{3})$   $(\sqrt{3}, \infty)$
    Sign            $-$                $-$                 $+$               $-$               $+$

Thus, the solution of the inequality is a totality consisting of
two intervals:

$$-\sqrt{3} < x < 1 \quad \text{and} \quad \sqrt{3} < x < \infty$$
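
The procedure of marking zeros and points of discontinuity and then testing one point in each part can be sketched as follows. We assume here that the inequality of the example reads $\frac{x^3 + 3x^2 - 4}{x^2 - 3} > 0$; the helper name sign_chart and the bounds $\pm 10$, which stand in for the whole axis, are our own choices.

```python
import math

def f(x):
    return (x ** 3 + 3 * x ** 2 - 4) / (x ** 2 - 3)

def sign_chart(f, marks, lo, hi):
    """Test one interior point in each part of (lo, hi) cut off by the marked points."""
    points = [lo] + sorted(marks) + [hi]
    chart = []
    for left, right in zip(points, points[1:]):
        mid = (left + right) / 2          # any interior point would do
        chart.append(((left, right), f(mid) > 0))
    return chart

r3 = math.sqrt(3)
marks = [-2.0, 1.0, -r3, r3]              # zeros and points of discontinuity
chart = sign_chart(f, marks, -10.0, 10.0)
solution = [interval for interval, positive in chart if positive]
print(solution)   # the parts (-sqrt(3), 1) and (sqrt(3), 10)
```

The sketch relies on the marked points being complete: a missed zero or point of discontinuity would make the sign test in that part unreliable.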


CHAPTER IV 


Derivatives, Differentials, 
Investigation of 
the Behaviour of Functions 


§ 1. Derivative


1. Some Problems Leading to the Concept of a Derivative. We 
come to the notion of a derivative, one of the most important notions 
in mathematics, when investigating the rate of change of a function. 

For example, let us turn to the notion of the velocity (rate) at 
a given instant of a rectilinear motion of a material point. A mate- 
rial point is understood in physics as a material body such that it 
is permissible to neglect its geometrical sizes while investigating 
the state of the body under some concrete conditions. In different 
circumstances a particle of a substance, or an airplane, or a heavenly 


Fig. 109 


body etc. may sometimes be regarded as a point. Let a material 
point move along the s-axis from left to right. In the general case 
the motion may be non-uniform, that is the velocity of the motion 
may be variable. The law of motion is expressed mathematically 
as a dependence of the coordinate s on time t: s = f (t). Since the 
velocity is variable the ratio of the distance passed over to the 
time taken represents the average velocity only. As for the “true” 
velocity, that is the velocity at a given instant, it can be obtained
by means of the following procedure. Let the moving point occupy
the position $A$ (see Fig. 109) at an instant $t$. Suppose during the period
of time $\Delta t$ (see Sec. I.22 on this notation) the moving point passes
to the position $B$, the distance $\Delta s$ being passed over. Then

$$s = f(t), \quad s + \Delta s = f(t + \Delta t)$$

i.e. $\Delta s = f(t + \Delta t) - f(t)$. Hence, the ratio $v_{av} = \frac{\Delta s}{\Delta t}$ (which
is the distance passed over per unit of time taken) is the average
velocity of motion during the time period $\Delta t$ from $t$ to $t + \Delta t$.
Now the instantaneous velocity of motion at time $t$ is obtained as the
limit of the average velocity in the process of decreasing the interval
$\Delta t$ unlimitedly, that is

$$v_{inst} = \lim_{\Delta t \to 0} v_{av} = \lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} = \lim_{\Delta t \to 0} \frac{f(t + \Delta t) - f(t)}{\Delta t} \qquad (1)$$
It is also said that the instantaneous velocity (that is the velocity
at a given instant, the true velocity) is the average velocity during
an infinitesimal interval of time (“element” of time) or that the
instantaneous velocity is the ratio of an infinitesimal distance to an
infinitesimal time interval. Both definitions briefly express the
meaning of the general definition (1).

Fig. 110

The rate of a physical process is not in all cases represented by the
distance passed over related to the unit of the time taken. Let us
consider, for example, the process of filling a vessel. In this case
the dependence $V = f(t)$ of the volume already filled on time $t$
expresses the law of the process of filling. The average rate of filling
during the interval of time from $t$ to $t + \Delta t$ is equal to the ratio

$$w_{av} = \frac{\Delta V}{\Delta t} = \frac{f(t + \Delta t) - f(t)}{\Delta t}$$

whereas the limit

$$w_{inst} = \lim_{\Delta t \to 0} w_{av} = \lim_{\Delta t \to 0} \frac{\Delta V}{\Delta t} = \lim_{\Delta t \to 0} \frac{f(t + \Delta t) - f(t)}{\Delta t} \qquad (2)$$

serves as the instantaneous rate, i.e. the rate of filling at a given instant.
Thus we have arrived at an expression similar to (1).

But we can understand the velocity, the rate, even in a wider 
sense relating the change of a quantity not to the unit of time but 
to the unit of some other quantity. For example, let us consider 
the notion of the linear density of a material line, that is of a body 
such that, under given concrete conditions, it is permissible to take 
into account only its size in one-dimensional extent (the longitudinal 
size) neglecting the cross-section sizes. At the same time we do not 
neglect its mass. If this line (“thread”) is homogeneous its linear 
density is equal to the ratio of its mass to its length. In case the 
thread is non-homogeneous its linear density is different at different 
points. Let us reckon the distance from one of the ends of the thread 
(see Fig. 110) and let the mass of the part of the thread corresponding 
to the distance $s$ be equal to $M = f(s)$. If now some additional
distance $\Delta s$ is passed the ratio

$$\rho_{av} = \frac{\Delta M}{\Delta s} = \frac{f(s + \Delta s) - f(s)}{\Delta s}$$

represents the average linear density of the thread corresponding
to the part $AB$. The limit

$$\rho = \lim_{\Delta s \to 0} \rho_{av} = \lim_{\Delta s \to 0} \frac{\Delta M}{\Delta s} = \lim_{\Delta s \to 0} \frac{f(s + \Delta s) - f(s)}{\Delta s} \qquad (3)$$

now gives the linear density of the thread at a point (namely, at
the point $A$). We may say that $\rho$ is the rate (velocity) of change
of the mass of the thread, i.e. the change of the mass per unit of
distance passed.
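
The passage from an average rate to an instantaneous one can be watched numerically. In the sketch below the law of motion $s = 5t^2$ and the helper name average_rate are our own illustrative assumptions; the same function serves equally for a volume $V = f(t)$ or a mass $M = f(s)$.

```python
def average_rate(f, t, dt):
    """Average rate of change of f over the interval from t to t + dt."""
    return (f(t + dt) - f(t)) / dt

def s(t):
    # Hypothetical law of motion, chosen only for illustration.
    return 5.0 * t * t

# Average velocities over shrinking intervals starting at t = 2.
rates = [average_rate(s, 2.0, dt) for dt in (1.0, 0.1, 0.001)]
print(rates)   # approximately 25, 20.5, 20.005
```

For this law the average velocity over the interval is $20 + 5\,\Delta t$, so the values approach 20 as the interval shrinks.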

2. Definition of Derivative. From the mathematical point of
view expressions (1), (2) and (3) are quite similar. This enables us
to state the following definition. Let a function $y = f(x)$ be given.
Then the rate of its change related to the unit of change of the argument
$x$ is equal to

$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$

This rate (velocity) is called the derivative of the variable (function)
$y$ with respect to the variable (argument) $x$; in other words,
the derivative is the limit of the ratio of the increment of the function
to the increment of the argument taken in the process when the increment
of the argument approaches zero. Since this rate has, in general,
different values for different values of $x$ the derivative itself is a
new function of $x$. This new function is designated as $y' = f'(x)$.

Hence, in the examples of Sec. 1 the velocity of motion is equal
to the derivative of the distance passed with respect to the time,
i.e. $v = s'_t$ (the subscript $t$ in the expression $s'_t$ indicates that the
derivative is taken with respect to the variable $t$) etc.

For example, let us compute the derivative of the function $y = ax^2$.
Increasing the argument by an increment $\Delta x$ we receive the new
value of the argument $x + \Delta x$ and the new value of the function
$y + \Delta y = a(x + \Delta x)^2$ since $x + \Delta x$ should be substituted for $x$
into the expression of the function. Thus,

$$\Delta y = a(x + \Delta x)^2 - ax^2 = 2ax\,\Delta x + a(\Delta x)^2$$

This implies

$$\frac{\Delta y}{\Delta x} = \frac{2ax\,\Delta x + a(\Delta x)^2}{\Delta x} = 2ax + a\,\Delta x, \qquad y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} (2ax + a\,\Delta x) = 2ax$$

Note that in the latter passage to the limit only $\Delta x$ varied as $\Delta x \to 0$
whereas $x$ was considered to be constant. The result thus obtained
can be written in the form $(ax^2)' = 2ax$.
We leave it to the reader to verify that $(ax^3)' = 3ax^2$, $(ax)' = a$
and the like. We particularly note here that $x' = 1$.

3. Geometrical Meaning of Derivative. Let us consider the graph
of a function $f(x)$ (see Fig. 111). We see that $\frac{\Delta y}{\Delta x} = \tan \beta$,
i.e. the ratio is equal to the slope of the secant $mm$. If $\Delta x \to 0$ then
the secant turns round the point $M$ and tends to the position of the
tangent $ll$ in the limit process since the tangent occupies the limiting
position of the secant when the points of intersection merge. (This obvious
property which we have already used is in fact nothing but the
definition of a tangent.) Therefore

$$y'_0 = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \tan \beta = \tan \alpha \quad [y'_0 = f'(x_0)] \qquad (4)$$

that is the geometrical meaning of the derivative of a function is that
it is equal to the slope of the tangent. By formula (II.21) it is easy
now to put down the equation of the tangent $ll$:

$$y - y_0 = y'_0 (x - x_0) \qquad (5)$$

where $x_0$ and $y_0$ are the coordinates of the point of tangency, $x$ and $y$
are the moving coordinates of the point on the tangent straight line.
Similarly, the equation of the normal to the curve, that is of
the line perpendicular to the tangent at the point of tangency, has
the form $y - y_0 = -\frac{1}{y'_0}(x - x_0)$ (see problem 5 in Sec. II.9).

Fig. 111    Fig. 112
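
The equations of the tangent and of the normal translate directly into a short computation. The sketch below is only an illustration: the helper name tangent_and_normal and the parabola $y = x^2$ are our own choices, and the derivative is supplied by the caller rather than computed.

```python
def tangent_and_normal(f, df, x0):
    """Return the tangent and the normal to y = f(x) at x0 as functions of x.

    df is the derivative of f; df(x0) must be nonzero, since otherwise
    the normal is parallel to the y-axis and has no finite slope.
    """
    y0, k = f(x0), df(x0)
    tangent = lambda x: y0 + k * (x - x0)     # y - y0 = y0' (x - x0)
    normal = lambda x: y0 - (x - x0) / k      # y - y0 = -(1 / y0') (x - x0)
    return tangent, normal

tangent, normal = tangent_and_normal(lambda x: x * x, lambda x: 2 * x, 1.0)
print(tangent(2.0))   # 3.0, the tangent at (1, 1) is y = 2x - 1
print(normal(3.0))    # 0.0, the normal at (1, 1) is y = 1 - (x - 1) / 2
```

The slopes of the two lines, 2 and $-\frac{1}{2}$, have the product $-1$, as is to be expected of perpendicular lines.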

In Sec. I.26 we said that the angle between two curves at the point
of their intersection was defined as the angle between the tangents
to the curves at that point, and therefore we are able now to determine
the angle by means of formula (II.23) since we know how to
determine the tangents. Note that the angle may turn out to be
zero in case these curves are tangent to each other, i.e. when their
tangents coincide.

When the graph of a function y = f (x) is given the geometrical 
meaning of the derivative makes it possible to indicate the slope 
of the tangent to the graph and this enables us to draw immediately 
a sketch of the graph of the derivative (see Fig. 112). For more accu- 
rate “graphical computation of the derivative” it is necessary to 
draw tangents to the given graph and measure their slopes. It turns 
out that it is practically simpler to draw normals to the graph by 
means of a shiny (metallic) ruler and to measure their slopes with 


Fig. 113 


respect to the y-axis which is just the same. One of these procedures 
is shown in Fig. 112. We apply the ruler perpendicularly to the 
plane of the graph to one of its points, e.g. to the point M, and 
turn the ruler in such a way that the reflection of the graph in the 
ruler should prolong the graph without a break at M. In this posi- 
tion the ruler will lie exactly along the normal to the graph at M. 
Then we draw a straight line passing through the point (0; 1) and 
parallel to the normal thus constructed. In this way we get the line 
segment OP which is then transferred to the position gq. After a 
number of such procedures are carried out we obtain a rather accurate 
graph of the derivative. 

While discussing the geometrical meaning of the derivative
[formula (4)] we supposed that both variables $x$ and $y$ were dimensionless
and that the scale was the same for both axes. But this is
not always the case in practical problems. It follows from formu- 


la (1.10) that in the general case we must write y= tan a. 
Thus in the general case the derivative is also equal to the ‘slope of 
the tangent. 


Note that if the derivative $y'$ approaches infinity for some value
of $x$ (for $x = x_1$ in Fig. 113) then the tangent at the corresponding
point of the graph has the slope equal to infinity, that is the tangent
line is parallel to the $y$-axis. If the derivative has a jump discontinuity
at a point then the tangent turns jumpwise, i.e. the graph
is broken at the point (see the point $x = x_2$ in Fig. 113). In case
the function approaches infinity the derivative may also turn into
infinity (see the point $x = x_3$ in Fig. 113).

4. Basic Properties of Derivatives.

1. The derivative of a constant equals zero. (The property is obviously
interpreted as the fact that the velocity of a body in a state
of rest is equal to zero.) The formal proof of the property looks as
follows: if $y = C = \text{const}$ then

$$\Delta y = C - C = 0, \quad \frac{\Delta y}{\Delta x} = 0, \quad y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} 0 = 0$$

2. The derivative of a sum is equal to the sum of the derivatives of
the summands. Indeed, if $y(x) = u(x) + v(x)$ then
$y(x + \Delta x) = u(x + \Delta x) + v(x + \Delta x)$ and

$$\Delta y = y(x + \Delta x) - y(x) = [u(x + \Delta x) + v(x + \Delta x)] - [u(x) + v(x)]$$
$$= [u(x) + \Delta u + v(x) + \Delta v] - [u(x) + v(x)] = \Delta u + \Delta v$$

that is the increment of a sum is equal to the sum of the increments:
$\Delta(u + v) = \Delta u + \Delta v$. Hence, it follows that

$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{\Delta u + \Delta v}{\Delta x} = \lim_{\Delta x \to 0} \left(\frac{\Delta u}{\Delta x} + \frac{\Delta v}{\Delta x}\right) = \lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x} + \lim_{\Delta x \to 0} \frac{\Delta v}{\Delta x} = u' + v'$$

(in the deduction we have used the fact that the limit of a sum is
equal to the sum of the limits) which is the required proof. This
property can be rewritten in a different way as

$$(u + v)' = u' + v'$$

We have taken a sum of two summands. It is clear that the same
is true for an arbitrary number of summands. Similarly, the increment
of a difference is equal to the difference of the increments and
the derivative of a difference is equal to the difference of the
derivatives.

Example. $(x^3 - 3x^2 + x + 5)' = (x^3)' - (3x^2)' + (x)' + (5)' = 3x^2 - 6x + 1 + 0 = 3x^2 - 6x + 1$
(see the end of Sec. 2).

3. A constant factor can be taken outside the derivative sign, i.e.
$(Cu)' = Cu'$ (where $C = \text{const}$). Indeed, if $y = Cu$ then

$$\Delta y = y(x + \Delta x) - y(x) = Cu(x + \Delta x) - Cu(x) = C[u(x + \Delta x) - u(x)] = C\,\Delta u$$

in other words, if a function is multiplied by a constant its increment
is multiplied by the same constant: $\Delta(Cu) = C\,\Delta u$.
Hence,

$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{C\,\Delta u}{\Delta x} = C \lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x} = Cu'$$

4. Formula for the Derivative of a Product. Let $y = uv$. Then
$\Delta y = (u + \Delta u)(v + \Delta v) - uv = (\Delta u)v + u\,\Delta v + \Delta u\,\Delta v$ which
implies

$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{(\Delta u) v}{\Delta x} + \lim_{\Delta x \to 0} \frac{u\,\Delta v}{\Delta x} + \lim_{\Delta x \to 0} \frac{\Delta u\,\Delta v}{\Delta x}$$
$$= \lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x}\, v + \lim_{\Delta x \to 0} u\, \frac{\Delta v}{\Delta x} + \lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x}\, \Delta v = u'v + uv' + u' \cdot 0$$

Thus,

$$(uv)' = u'v + uv' \qquad (6)$$

For example,

$$[(3x^2 + 5x)(4x^2 - 6)]' = (3x^2 + 5x)'(4x^2 - 6) + (3x^2 + 5x)(4x^2 - 6)'$$
$$= (6x + 5)(4x^2 - 6) + (3x^2 + 5x) \cdot 8x = 48x^3 + 60x^2 - 36x - 30$$

From formula (6) we can easily deduce the formula for the derivative
of a product of several factors. For example,

$$(uvw)' = [(uv)w]' = (uv)'w + (uv)w' = (u'v + uv')w + uvw' = u'vw + uv'w + uvw'$$

The formula for the derivative of a product of an arbitrary number
of factors looks quite similar. We note that property 3 can be easily
deduced from formula (6) by putting $v = C$.
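
The worked example for formula (6) can be checked mechanically for polynomials. The sketch below is our own illustration: polynomials are stored as coefficient lists with the lowest degree first, the helper names are hypothetical, and the term-by-term rule $(ax^n)' = nax^{n-1}$ is assumed as known.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_deriv(p):
    """Term-by-term derivative: (a x^n)' = n a x^(n-1)."""
    return [n * a for n, a in enumerate(p)][1:] or [0]

u = [0, 5, 3]     # 3x^2 + 5x
v = [-6, 0, 4]    # 4x^2 - 6

lhs = poly_deriv(poly_mul(u, v))                                   # (uv)'
rhs = poly_add(poly_mul(poly_deriv(u), v), poly_mul(u, poly_deriv(v)))  # u'v + uv'
print(lhs, rhs)   # both are [-30, -36, 60, 48], i.e. 48x^3 + 60x^2 - 36x - 30
```

Differentiating the expanded product and applying formula (6) give the same coefficients, in agreement with the example above.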


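5. Formula for the Derivative of a Quotient. Let $y = \frac{u}{v}$. Then

$$\Delta y = \frac{u + \Delta u}{v + \Delta v} - \frac{u}{v} = \frac{(\Delta u) v - u\,\Delta v}{v (v + \Delta v)}$$

From this, representing $\Delta v$ which enters into the denominator in
the form $\frac{\Delta v}{\Delta x} \Delta x$, we obtain

$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{\frac{\Delta u}{\Delta x} v - u \frac{\Delta v}{\Delta x}}{v \left(v + \frac{\Delta v}{\Delta x} \Delta x\right)} = \frac{u'v - uv'}{v (v + v' \cdot 0)}$$

Thus,

$$\left(\frac{u}{v}\right)' = \frac{u'v - uv'}{v^2} \qquad (7)$$

For example,

$$\left(\frac{5x^2}{3x^2 + 4}\right)' = \frac{(5x^2)'(3x^2 + 4) - (5x^2)(3x^2 + 4)'}{(3x^2 + 4)^2} = \frac{10x(3x^2 + 4) - 5x^2 \cdot 6x}{(3x^2 + 4)^2} = \frac{40x}{(3x^2 + 4)^2}$$
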
6. The Derivative of a Composite Function. Let $y = f(u)$,
$u = \varphi(x)$ and let $y$ be regarded as a composite function of $x$. If $x$
receives an increment $\Delta x$ then the intermediate variable $u$ receives
an increment $\Delta u$ and therefore $y$ receives an increment $\Delta y$ too.
We have*

$$\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x} \qquad (8)$$

Now let $\Delta x \to 0$. Then $\frac{\Delta u}{\Delta x} \to u'_x$ and hence $\Delta u = \frac{\Delta u}{\Delta x} \Delta x \to u'_x \cdot 0 = 0$.
Therefore $\frac{\Delta y}{\Delta u} \to y'_u$. Passing to the limit in formula (8) we
obtain

$$y'_x = y'_u u'_x \qquad (9)$$

The last formula may be rewritten as

$$[f(\varphi(x))]' = f'(\varphi(x))\, \varphi'(x) \qquad (10)$$

In the case when a composite function is formed by means of
a greater number of intermediate stages the derivative is computed
in the same way. Thus, if $y = y(u)$, $u = u(v)$ and $v = v(x)$ then
$y'_x = y'_u u'_v v'_x$.

For instance, let $y = (x^2 - 5x + 3)^3$. Then we can denote $y = u^3$
where $u = x^2 - 5x + 3$, and by formula (9) we get

$$y'_x = y'_u u'_x = (u^3)'_u (x^2 - 5x + 3)'_x = 3u^2 (2x - 5) = 3(x^2 - 5x + 3)^2 (2x - 5)$$

which is, of course, simpler than removing the brackets! In practical
computations there is no need to write down all this in such a detailed
manner. For instance, the former calculations can be put down as
follows:

$$[(x^2 - 5x + 3)^3]' = 3(x^2 - 5x + 3)^2 (x^2 - 5x + 3)' = 3(x^2 - 5x + 3)^2 (2x - 5)$$

* Formula (8) is valid, of course, only if $\Delta u \neq 0$. But in case $\Delta u = 0$ it
is also easy to prove formula (10).— Tr.
We use here formula (10), of course. After some practice one can
write the result immediately without intermediate transformations.
For this purpose it is advisable to remember the formulas for the
derivatives in the form $(u^2)' = 2uu'$, $(u^3)' = 3u^2u'$ and the like
(the derivatives are taken with respect to $x$).
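As a quick check of the composite-function rule, the derivative of $(x^2 - 5x + 3)^3$ found by formula (9) can be compared with a difference quotient. This is a minimal sketch; the function names, the test point $x = 2$ and the step size are illustrative assumptions.

```python
def y(x):
    """The composite function y = u**3 with u = x**2 - 5x + 3."""
    return (x**2 - 5 * x + 3) ** 3


def y_prime(x):
    """Formula (9): y'_x = y'_u * u'_x = 3u**2 * (2x - 5)."""
    u = x**2 - 5 * x + 3
    return 3 * u**2 * (2 * x - 5)


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient used as a numerical reference."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 2.0  # here u = -3 and 2x - 5 = -1
print(y_prime(x0), central_difference(y, x0))
```

At $x = 2$ the rule gives $3 \cdot 9 \cdot (-1) = -27$, and the difference quotient should be close to that value.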

7. The Derivative of an Inverse Function. Suppose the equality
$y = y(x)$ defines the inverse relation $x = x(y)$ (see Sec. I.24) for
which we can determine the derivative $x'_y$. Then it is easy to compute
the derivative of the original function $y(x)$. Indeed,
$\frac{\Delta y}{\Delta x} = \frac{1}{\Delta x / \Delta y}$, which
implies, as $\Delta x \to 0$ and $\Delta y \to 0$,
$$y'_x = \frac{1}{x'_y} \qquad (11)$$
For example, let $y = \sqrt{x}$ which yields $x = y^2$. Then
$$y'_x = \frac{1}{x'_y} = \frac{1}{(y^2)'_y} = \frac{1}{2y} = \frac{1}{2\sqrt{x}}$$


8. The Derivative of an Implicit Function. If a function is determined
in an implicit form $F(x, y) = 0$ (see Sec. I.20) then to compute
the derivative $y'_x$ one should simply equate the derivatives of
the left-hand side and of the right-hand side of the latter relation
taking into account that $y$ is a function of $x$ which turns the relation
into an identity. Generally, it is permissible to equate the derivatives
of both sides of an equality if and only if the equality is an
identity (but not an equation!).

For instance, let us take
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \qquad (12)$$
Then
$$\left(\frac{x^2}{a^2}\right)'_x + \left(\frac{y^2}{b^2}\right)'_x = 0, \quad \text{i.e.} \quad \frac{2x}{a^2} + \frac{2yy'}{b^2} = 0 \qquad (13)$$
Computing the derivative of the second term we have used
property 6: $\left(\frac{y^2}{b^2}\right)'_x = \frac{1}{b^2}\,(y^2)'_y\,y'_x = \frac{2y}{b^2}\,y'$. Thus, (13) implies
$$y' = -\frac{b^2 x}{a^2 y} \qquad (14)$$
5. Derivatives of Basic Elementary Functions. 
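1. The Derivative of Sine. Let $y = \sin x$. If the argument changes
and becomes equal to $x + \Delta x$ then the function becomes equal to
$\sin(x + \Delta x)$. This implies
$$\Delta y = \sin(x + \Delta x) - \sin x = 2 \sin\frac{\Delta x}{2}\,\cos\left(x + \frac{\Delta x}{2}\right);$$
$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{2 \sin\frac{\Delta x}{2}\,\cos\left(x + \frac{\Delta x}{2}\right)}{\Delta x} = \lim_{\Delta x \to 0} \frac{\sin\frac{\Delta x}{2}}{\frac{\Delta x}{2}} \cdot \lim_{\Delta x \to 0} \cos\left(x + \frac{\Delta x}{2}\right) = 1 \cdot \cos x$$
[see formula (III.11)]. Hence,
$$(\sin x)' = \cos x \qquad (15)$$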
2. We leave it to the reader to verify in an analogous way that
$$(\cos x)' = -\sin x \qquad (16)$$

3. The derivative of tangent is calculated by formula (7):
$$(\tan x)' = \left(\frac{\sin x}{\cos x}\right)' = \frac{(\sin x)' \cos x - \sin x\,(\cos x)'}{\cos^2 x} = \frac{\cos x \cdot \cos x - \sin x\,(-\sin x)}{\cos^2 x} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x}$$
4. Similarly, we can verify that $(\cot x)' = -\frac{1}{\sin^2 x}$.


5. The Derivative of Arc Sine. Take $y = \arcsin x$. Then $x = \sin y$
and, by formula (11),
$$y'_x = \frac{1}{x'_y} = \frac{1}{(\sin y)'_y} = \frac{1}{\cos y} = \frac{1}{+\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}$$
We have written + in front of the radical sign because the values
of $\arcsin x$, as is well known, are taken in the interval
$-\frac{\pi}{2} \le \arcsin x \le \frac{\pi}{2}$ which corresponds to non-negative values
$\cos y \ge 0$. Thus,
$$(\arcsin x)' = \frac{1}{\sqrt{1 - x^2}}$$


6. We verify similarly that
$$(\arccos x)' = -\frac{1}{\sqrt{1 - x^2}} \qquad (17)$$
using the inequality $0 \le \arccos x \le \pi$. The resemblance between
the last two results is explained by the formula
$$\arcsin x + \arccos x = \frac{\pi}{2} \qquad (18)$$
which can be deduced in the following way: if we denote $\sin \alpha = x$
then
$$\cos\left(\frac{\pi}{2} - \alpha\right) = x$$
and these formulas yield $\alpha = \arcsin x$ and $\frac{\pi}{2} - \alpha = \arccos x$.
The addition of the last results implies (18).

7. The Derivative of Arc Tangent. Let $y = \arctan x$. Then $x = \tan y$.
Using the formulas for the derivatives of an inverse function
and of the tangent we obtain
$$y'_x = \frac{1}{x'_y} = \frac{1}{\frac{1}{\cos^2 y}} = \cos^2 y = \frac{1}{1 + \tan^2 y} = \frac{1}{1 + x^2}$$
Thus,
$$(\arctan x)' = \frac{1}{1 + x^2}$$


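8. The Derivative of a Logarithmic Function. Take $y = \ln x$.
Then putting $h = \frac{\Delta x}{x}$ in formula (III.12) we see that
$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{\ln(x + \Delta x) - \ln x}{\Delta x} = \lim_{\Delta x \to 0} \frac{\ln\left(1 + \frac{\Delta x}{x}\right)}{\Delta x} = \frac{1}{x}$$
Therefore
$$(\ln x)' = \frac{1}{x}$$
Applying formula (I.14) and taking into account that $\ln a = \text{const}$
we receive
$$(\log_a x)' = \left(\frac{\ln x}{\ln a}\right)' = \frac{1}{\ln a}\,(\ln x)' = \frac{1}{x \ln a}$$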
9. The Derivative of an Exponential Function. If $y = a^x$ then
$x = \log_a y$ and
$$y'_x = \frac{1}{x'_y} = \frac{1}{\frac{1}{y \ln a}} = y \ln a = a^x \ln a$$
Hence,
$$(a^x)' = a^x \ln a$$
In particular $(e^x)' = e^x$.
10. The Derivative of a Power Function. According to the formula
of the derivative of a composite function we have
$$(x^n)' = [(e^{\ln x})^n]' = (e^{n \ln x})' = e^{n \ln x}\,n\,\frac{1}{x} = x^n\,\frac{n}{x} = n x^{n-1}$$
Thus,
$$(x^n)' = n x^{n-1}$$
This formula holds for any $n$, both integral and non-integral.
For example, $(\sqrt{x})' = (x^{1/2})' = \frac{1}{2}\,x^{-1/2} = \frac{1}{2\sqrt{x}}$, $\left(\frac{1}{x}\right)' = (x^{-1})' = -x^{-2} = -\frac{1}{x^2}$
and the like.
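11. The Derivatives of Hyperbolic Functions. We have
$$(\sinh x)' = \left(\frac{e^x - e^{-x}}{2}\right)' = \frac{e^x + e^{-x}}{2} = \cosh x$$
Similarly,
$$(\cosh x)' = \sinh x;$$
$$(\tanh x)' = \left(\frac{\sinh x}{\cosh x}\right)' = \frac{(\sinh x)' \cosh x - \sinh x\,(\cosh x)'}{\cosh^2 x} = \frac{\cosh^2 x - \sinh^2 x}{\cosh^2 x} = \frac{1}{\cosh^2 x};$$
$$(\sinh^{-1} x)' = [\ln(x + \sqrt{x^2 + 1})]' = \frac{1}{x + \sqrt{x^2 + 1}}\,(x + \sqrt{x^2 + 1})' = \frac{1}{x + \sqrt{x^2 + 1}} \left(1 + \frac{(x^2 + 1)'}{2\sqrt{x^2 + 1}}\right) =$$
$$= \frac{1}{x + \sqrt{x^2 + 1}} \left(1 + \frac{2x}{2\sqrt{x^2 + 1}}\right) = \frac{1}{x + \sqrt{x^2 + 1}} \cdot \frac{\sqrt{x^2 + 1} + x}{\sqrt{x^2 + 1}} = \frac{1}{\sqrt{x^2 + 1}}$$
These formulas also demonstrate a rather close analogy between
trigonometric and hyperbolic functions.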

12. The above formulas (comprising the table of basic differentia- 
tion* formulas) should be learnt by heart since they will be per- 
manently used in what follows. With the help of the formulas it 
is possible to compute the derivative of any elementary function by 
using the rules of Sec. 4. For example, 


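$$(\sqrt[3]{x}\;2^{\tan 5x})' = (\sqrt[3]{x})'\;2^{\tan 5x} + \sqrt[3]{x}\;(2^{\tan 5x})'$$
But
$$(\sqrt[3]{x})' = (x^{1/3})' = \frac{1}{3}\,x^{-2/3} = \frac{1}{3\sqrt[3]{x^2}}$$
and, by the formula of the derivative of a composite function,
we obtain
$$(2^{\tan 5x})' = 2^{\tan 5x} \ln 2\,(\tan 5x)' = 2^{\tan 5x} \ln 2\,\frac{1}{\cos^2 5x}\,(5x)' = \frac{5 \ln 2 \cdot 2^{\tan 5x}}{\cos^2 5x}$$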
* The operation of finding the derivative of a function is usually called 
differentiation (see Sec. 8).— Tr. 
Taking the common factor outside the brackets we derive
$$(\sqrt[3]{x}\;2^{\tan 5x})' = \frac{2^{\tan 5x}}{3\sqrt[3]{x^2}} \left(1 + 15 \ln 2\,\frac{x}{\cos^2 5x}\right)$$
After some practice calculations of this type can be carried out much 
faster without intermediate transformations. 

13. In some cases it is useful to take logarithms before calculating
a derivative. For example, let it be necessary to find the derivative
$(x^{\sin x})'$. Then we write $y = x^{\sin x}$; $\ln y = \sin x \ln x$ and
$(\ln y)' = (\sin x \ln x)'$. Therefore,
$$\frac{1}{y}\,y' = \cos x \ln x + \sin x \cdot \frac{1}{x}$$
To calculate the left-hand side we have applied the formula of the
derivative of a composite function. Finally, from the last relation,
$$y' = (x^{\sin x})' = x^{\sin x} \left(\cos x \ln x + \sin x \cdot \frac{1}{x}\right)$$


This method is sometimes used when it is necessary to find the 
derivative of the product of several factors since after taking the 
logarithm the product turns into a sum, and, generally speaking, 
it is easier to find the derivative of the sum than that of the product. 
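The result of logarithmic differentiation for $x^{\sin x}$ can be compared with a difference quotient. The sketch below is only illustrative: the function names, the point $x = 2$ and the step size are assumptions, and $x > 0$ is required so that $\ln x$ is defined.

```python
import math


def y(x):
    """y = x**sin(x), defined here for x > 0."""
    return x ** math.sin(x)


def y_prime(x):
    """Result of logarithmic differentiation: y' = y * (cos x * ln x + sin x / x)."""
    return y(x) * (math.cos(x) * math.log(x) + math.sin(x) / x)


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient used as a numerical reference."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 2.0
print(y_prime(x0), central_difference(y, x0))
```

The two printed numbers should coincide up to the error of the difference quotient.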

6. Determining Tangent in Polar Coordinates. The problem of
determining the tangent to a curve which is represented by its equation
in Cartesian coordinates was solved in Sec. 3. Now let a curve
be given by its equation $\rho = f(\varphi)$ in polar coordinates. To determine
the tangent we could transform the equation into Cartesian coordinates
but it is simpler to solve the problem directly in polar coordinates.
Let the position of the tangent be determined by the angle $\theta$
(see Fig. 114). Let us give $\varphi$ a small increment $\Delta\varphi$ and let us consider
the infinitesimal curvilinear triangle MNP formed by the
two coordinate lines and the graph of $\rho = f(\varphi)$. For the sake of
convenience the triangle is depicted in Fig. 114 separately. The
triangle may be regarded as a “genuine” triangle (i.e. as a rectilinear
triangle) to within infinitesimals of higher order and even
as a right-angled triangle since $\angle N = 90°$. (Why is it so?) This
implies
$$\cot\theta = \lim_{\Delta\varphi \to 0} \cot\theta^* = \lim_{\Delta\varphi \to 0} \frac{\Delta\rho}{\rho\,\Delta\varphi} = \frac{\rho'_\varphi}{\rho} \qquad (19)$$


Example 1. For the logarithmic spiral (see Sec. II.5) we have
$$\cot\theta = \frac{1}{e^{k\varphi}}\,(e^{k\varphi})' = k = \text{const}$$
Thus, the spiral intersects all the coordinate rays forming one and
the same angle with them. It is easy to find the relationship between
this property and the property indicated in Sec. II.5: they turn
out to be equivalent.

Example 2. Let us consider the polar equation of a parabola with
respect to its focus [i.e. equation (II.29) in which we should put
e = 1]. By formula (19) we have
$$\cot\theta = \frac{1 + \cos\varphi}{p} \cdot \frac{p \sin\varphi}{(1 + \cos\varphi)^2} = \frac{\sin\varphi}{1 + \cos\varphi} = \frac{2 \sin\frac{\varphi}{2} \cos\frac{\varphi}{2}}{2 \cos^2\frac{\varphi}{2}} = \tan\frac{\varphi}{2} = \cot\left(\frac{\pi}{2} - \frac{\varphi}{2}\right); \qquad \theta = \frac{\pi}{2} - \frac{\varphi}{2}$$

Fig. 114    Fig. 115

Therefore if we draw a straight line parallel to the polar axis and
passing through the point M we shall have (see Fig. 115)
$\alpha + \theta = \pi - \varphi$, that is
$$\alpha = \pi - \varphi - \theta = \pi - \varphi - \left(\frac{\pi}{2} - \frac{\varphi}{2}\right) = \frac{\pi}{2} - \frac{\varphi}{2} = \theta = \beta$$
From this we obtain the basic optical property of a parabola: if the
light propagates in the plane of the parabola from a light source
placed in its focus then all the rays reflected by the parabola are
parallel to its axis. Here lies the explanation of the fact that the
reflector of a projector is shaped in the form of a surface generated
by the revolution of a parabola about its axis (the parabolic reflector).

Fig. 116

It is a little more complicated to deduce analogous optical pro- 
perties of an ellipse and of a hyperbola. For instance, one can show 
that all the rays issued from a focus of an ellipse gather at the other 
focus after being reflected from the ellipse (see Fig. 116). A parabola 
may be regarded as an ellipse with one of its focuses removed to 
infinity (see the end of Sec. II.12) and therefore the optical property 
of a parabola is implied by that of an ellipse if we pass to the limit. 


§ 2. Differential 


7. Physical Examples. The notion of a differential is closely 
related to that of a derivative and is also one of the most important 
notions in mathematics. We shall illustrate it by considering the 
same examples as in Sec. 1. 

Let a point have the velocity $v = s'_t = f'(t)$ at a moment $t$ in
its rectilinear motion according to the law of motion $s = f(t)$.
If now some additional time $\Delta t$ passes the point will cover some
additional distance $\Delta s$. In case the motion is non-uniform the dependence
of $\Delta s$ on $\Delta t$ can be complicated because the velocity of motion
varies all the time. But if $\Delta t$ (the time passed) is not large the velocity
has no time to change considerably during the period of time from
$t$ to $t + \Delta t$. Therefore the motion may be regarded as “almost uniform”
during this period. Hence, reckoning the distance we shall
not get a serious error if we regard the motion as uniform, i.e. as
having a constant velocity, namely, the velocity it had at the
moment $t$.

Thus we obtain the distance $v\,\Delta t = s'_t\,\Delta t$. The distance is directly
proportional to the time passed $\Delta t$. The quantity $s'_t\,\Delta t$ is called the
differential of the distance and is denoted as $ds$: $ds = s'_t\,\Delta t$ (the symbol $ds$ should
be understood as an indivisible symbol and not as the product of $d$
by $s$). The real distance $\Delta s$ differs from the “invented” distance $ds$,
of course, since the velocity may change during the period $\Delta t$ no
matter how small $\Delta t$ is. But nevertheless if this period is small
enough we can put approximately
$$\Delta s \approx ds \qquad (20)$$
But the smaller $\Delta t$, the smaller the change of the velocity. Therefore
the accuracy of formula (20) becomes greater as $\Delta t$ is decreased.
In Sec. 8 we shall show that when the interval $\Delta t$ is infinitesimal
the difference between $\Delta s$ and $ds$ is an infinitesimal variable of higher
order relative to $\Delta s$. There are many situations when it is permissible
to neglect such infinitesimals of higher order. Then it is possible
to say that the differential of the distance is nothing but an
infinitesimal distance, i.e. the distance corresponding to an infinitesimal
interval of time. At the same time, of course, the differential
of the distance may not be an infinitesimal at all in case
$\Delta t$ is not small, but the greater $\Delta t$, the lower the degree of accuracy
of formula (20). Nevertheless, it is much easier to compute $ds$ as
a distance passed in a uniform motion than to evaluate the real
distance $\Delta s$; this accounts for the fact that formula (20) is often
used even when $\Delta t$ is not very small.

Turning to the second example and reasoning in the same way
we can say that the differential of the volume $dV$ represents the
volume which would be filled if the rate of filling remained constant
and equal to the rate at the moment $t$ during the period of time
from $t$ to $t + \Delta t$, that is $dV = V'_t\,\Delta t$. Similarly, the differential of
the mass in the third example is the mass which the part AB of the
curve (see Fig. 110) would have if the density of this part were constant
and equal to the density at the point A, that is $dM = \rho\,\Delta s = M'_s\,\Delta s$.

In all cases the replacement of a real change of a quantity by its 
differential reduces to the transition from some non-uniform pro- 
cesses, non-homogeneous objects, etc. to the uniform and homo- 
geneous ones. Such a replacement is always based upon the fact 
that every process is “almost uniform” during a small interval of 
time and every object is “almost homogeneous” in the small and 
so on. 

8. Definition of Differential and Its Connection with Increment.
Now we shall give the general definition of a differential. Let the
argument of a function $y = f(x)$ first take a value $x$ and then receive
an increment $\Delta x$. Then the differential of the function is the product
$$dy = df(x) = y'\,\Delta x = f'(x)\,\Delta x \qquad (21)$$
The differential is therefore the increment which the function would
receive if it changed in the interval from $x$ to $x + \Delta x$ with the same
velocity as for the value $x$ of the argument.

The operation of finding the differential of a function is called
differentiation of the function; it is carried out quite simply by means
of formula (21). For example, let $y = \sin x$; then $dy = (\sin x)'\,\Delta x = \cos x\,\Delta x$,
i.e. $d \sin x = \cos x\,\Delta x$.

Similarly, $d \tan x = \frac{1}{\cos^2 x}\,\Delta x$, $d(x^3) = 3x^2\,\Delta x$ and the like.


Thus, when differentiating a function one must find its derivative
and multiply the result by $\Delta x$; therefore the operation of computing
the derivative is also often called differentiation. But one must
take care not to confuse the derivative with the differential. The
derivative of a function $y = f(x)$ depends only on $x$ whereas the
differential also depends on $\Delta x$. In practical applications the differential
is usually regarded as an infinitesimal whereas the derivative
is understood as a finite value. In case variables $x$ and $y$ have certain
dimensions
$$[dy] = [y], \qquad [y'] = \frac{[y]}{[x]}$$

We note, in particular, that
$$dx = x'_x\,\Delta x = 1 \cdot \Delta x = \Delta x$$
which means that the differential of an independent variable is equal
to its increment. This makes it possible to rewrite formula (21):
$$dy = f'(x)\,dx = y'\,dx \qquad (22)$$
and, on the other hand, to represent the derivative as the ratio of
the differentials:
$$y' = \frac{dy}{dx} \quad \text{or, which is the same,} \quad f'(x) = \frac{df(x)}{dx}$$


The geometrical meaning of the differential of a function is shown
in Fig. 117: the differential is equal to the increment of the ordinate
of the tangent. Hence, the replacement of the increment of a function
by its differential is equivalent to the replacement of the graph of
the function by the segment of the tangent drawn through the point
A. This replacement is justified in case $\Delta x$ is small enough.

Fig. 117: $BD = \Delta y$, $BC = AB \tan\alpha = \Delta x \cdot y' = dy$
Fig. 118: $\Delta y = s_1 + s_2 + s_3$, $dy = s_1 + s_3$, $(\Delta x)^2 = s_2$

To investigate the connection between the differential and the
increment we take into account that $\frac{\Delta y}{\Delta x} \to y'$ as $\Delta x \to 0$,
i.e. $\frac{\Delta y}{\Delta x} = y' + \alpha$ where $\alpha \to 0$ as $\Delta x \to 0$. This yields
$$\Delta y = y'\,\Delta x + \alpha\,\Delta x = dy + \beta \qquad (23)$$
where $\beta = \alpha\,\Delta x$ is an infinitesimal variable of higher order than
$\Delta x$ (it is represented by the segment CD in Fig. 117). The equality
(23) can be formulated as follows: the differential is the principal linear part
of the increment of a function. It is called here the principal part
since the difference between the differential and the increment is
the infinitesimal $\beta$ of higher order and it is called the linear part
since it is directly proportional to $\Delta x$ (compare with Sec. I.22).
If $y' \neq 0$ then $dy$ and $dx = \Delta x$ are infinitesimals of the same order
and therefore $\beta$ in formula (23) is an infinitesimal of higher order
than $dy$, that is $dy$ and $\Delta y$ are equivalent infinitesimals (see Sec.
II.8).

Let us take an example to illustrate the error which occurs if the
increment of a function is replaced by its differential. Let $y = x^2$
and let the argument first assume the value $x = 1$ and then receive
the increment $\Delta x$. We have
$$\Delta y = (1 + \Delta x)^2 - 1^2 = 2\,\Delta x + \Delta x^2; \qquad dy = y'\,\Delta x = 2 \cdot 1 \cdot \Delta x = 2\,\Delta x$$
Therefore, $\Delta y$ and $dy$ differ from each other by the infinitesimal
$(\Delta x)^2$ of the second order (see Fig. 118). In particular,

for $\Delta x = 0.1$: $\Delta y = 0.21$; $dy = 0.2$; the error is 5 per cent;

for $\Delta x = 0.01$: $\Delta y = 0.0201$; $dy = 0.02$; the error is 0.5 per cent;

for $\Delta x = -0.001$: $\Delta y = -0.001999$; $dy = -0.002$; the error is 0.05 per cent etc.

It is obvious here that the relative error generated by the replacement
of $\Delta y$ by $dy$ decreases rapidly as $|\Delta x|$ decreases.
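The comparison of the increment with the differential for $y = x^2$ at $x = 1$ can be reproduced with a few lines of code. This is a hedged sketch: the helper names are assumptions, and the relative error is measured here with respect to the true increment, which gives values close to the rounded percentages quoted in the text.

```python
def increment(f, x, dx):
    """True increment of the function: f(x + dx) - f(x)."""
    return f(x + dx) - f(x)


def differential(df, x, dx):
    """Formula (21): dy = f'(x) * dx."""
    return df(x) * dx


f = lambda x: x**2
df = lambda x: 2 * x


def relative_error(dx, x=1.0):
    """Relative error made when the increment is replaced by the differential."""
    delta_y = increment(f, x, dx)
    dy = differential(df, x, dx)
    return abs(delta_y - dy) / abs(delta_y)


for dx in (0.1, 0.01, -0.001):
    print(dx, increment(f, 1.0, dx), differential(df, 1.0, dx), relative_error(dx))
```

Each tenfold decrease of $|\Delta x|$ reduces the relative error roughly tenfold, in agreement with the fact that the discarded term $(\Delta x)^2$ is of the second order.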

A function which has the differential is called differentiable. 
In other words, a differentiable function is a function such that its 
small increment has the principal linear part, i.e. a function which 
may be approximately replaced by a linear function on every small 
interval of change of the argument (such a replacement is the so- 
called process of linearizing). A differentiable function must have 
a finite derivative, and the function itself must be continuous for 
the considered values of the argument since (23) shows that $\Delta y$
is infinitesimal when $\Delta x$ is infinitesimal. At the same time a continuous
function may turn out to be non-differentiable at some
points. For instance, the function shown in Fig. 113 is non-differen- 
tiable not only at the point of discontinuity x = x, but also at the 
points = x and x = z, where it is continuous. B. Bolzano, a 
famous Czech mathematician (1781-1848), and, independently of 
him, K. Weierstrass (1815-1897), a prominent German mathemati- 
cian, discovered (the former in 1830 and the latter in 1860; Bolzano’s 
result was not published) the existence of continuous functions which 
are non-differentiable for all values of the argument. Such functions 
had been considered a mathematical trick for a long time but it
turned out that they were of essential importance for describing
processes of the type of a Brownian motion. We shall not pay atten- 
tion to the possibility of existence of such “monsters” in our intro- 
ductory course. 

9. Properties of Differential. The differential of a function is 
obtained by multiplying its derivative by the differential of the 
argument and therefore each property of the derivative (see Sec. 4) 
obviously implies the corresponding property of the differential. 
For example, multiplying both parts of the equality $(u + v)' = u' + v'$
by $dx$ we receive $(u + v)'\,dx = u'\,dx + v'\,dx$ or, which
is the same,
$$d(u + v) = du + dv$$
(i.e. the differential of a sum is equal to the sum of the differentials).
Similarly, we deduce the formula
$$d(uv) = (du)\,v + u\,dv \qquad (24)$$
and the like. We shall see in Sec. IX.12 that these formulas also
hold for the case of an arbitrary number of independent variables.

The implication of the formula for the derivative of a composite
function is of special importance. Let $y = f(x)$ and let $x$ first be
an independent variable. Then each of the formulas (21) and (22)
can be used for calculating $dy$ since in this case $\Delta x = dx$. Now let $x$
depend on a third variable, for example, $x = x(t)$. Then $\Delta x \neq dx$
but it turns out that nevertheless formula (22) remains true [whereas
formula (21) does not hold, in general]. Indeed,
$$dy = y'_t\,dt = y'_x\,x'_t\,dt = y'_x\,dx$$
which is what we set out to prove. Therefore it is natural to use
formula (22) [and not (21)] for calculating the differential since
this formula remains true (invariant) in all cases.


Now we shall apply this invariance property* to computing the
derivative of a function represented parametrically (see Sec. II.6).
Let $x = x(t)$ and $y = y(t)$ ($t$ is a parameter). Then $dx = \dot{x}\,dt$ and
$dy = \dot{y}\,dt$ (the dot usually denotes the derivative with respect to
a parameter) which implies
$$y'_x = \frac{dy}{dx} = \frac{\dot{y}}{\dot{x}} \qquad (25)$$
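Formula (25) can be illustrated on a curve that is not taken from the text: the unit circle $x = \cos t$, $y = \sin t$, whose upper half is also given explicitly by $y = \sqrt{1 - x^2}$. The sketch below assumes this example and the helper name `dy_dx_parametric`; both are chosen only for illustration.

```python
import math


def dy_dx_parametric(x_dot, y_dot, t):
    """Formula (25): y'_x = (dy/dt) / (dx/dt), valid where dx/dt != 0."""
    return y_dot(t) / x_dot(t)


# Unit circle x = cos t, y = sin t (an illustrative choice of curve)
x_dot = lambda t: -math.sin(t)
y_dot = lambda t: math.cos(t)

t0 = 1.0
slope_parametric = dy_dx_parametric(x_dot, y_dot, t0)

# The same slope from the explicit form y = sqrt(1 - x^2) of the upper half
x0 = math.cos(t0)
slope_explicit = -x0 / math.sqrt(1 - x0**2)

print(slope_parametric, slope_explicit)
```

Both computations should give the same slope, $-\cot t_0$.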


All the properties of differentials are used, in particular, for 
linearizing relations between variables, that is for passing from 
a general, non-linear, relation to the linear relation between the 
increments of the variables. Such a linearization is possible in case 


* This property is usually called the invariance of the form of the
differential.— Tr.
the changes of the variables are small, and it is based on dropping
infinitesimals of higher order.

Thus, for instance, equation (II.30) characterizes the non-linear
relation between the coordinates of a point $M(x, y)$ belonging
to a curve of the second order. But now let the position of the point
$M$ change near a fixed point $M_0(x_0, y_0)$, that is let the increments
$x - x_0 = \xi$ and $y - y_0 = \eta$ be small. Differentiating equation
(II.30) and replacing the differentials by the increments we come
to the linearized equation
$$2Ax_0\xi + 2By_0\xi + 2Bx_0\eta + 2Cy_0\eta + D\xi + E\eta = 0 \qquad (26)$$
which describes the approximate linear relation between $\xi$ and $\eta$.
Since when deducing equation (26) we replaced the differentials
$dx$ and $dy$ by the increments $\xi$ and $\eta$, the point $M$ of the line satisfies
the equation only with an accuracy of the infinitesimals of higher
order. Equation (26) is precisely satisfied by the points of the tangent
line to curve (II.30) drawn through the point $M_0$.

The linearization is widely used in physics, in particular, when 
differential equations are deduced (see Sec. XIV.6). 

10. Application of Differentials to Approximate Calculations. 
Differentials are widely used for approximate calculations. First 
of all, the increment of a function is often replaced by its differen- 
tial which, as a rule, can be found in a simpler way. 

Suppose we are given a function $y = f(x)$. Let a particular value
$f(a)$ be known. Let the argument $x$ receive a small increment $\Delta x = h$.
Then we can put
$$f(a + h) - f(a) = \Delta y \approx dy = f'(a)\,h$$
that is
$$f(a + h) \approx f(a) + f'(a)\,h \qquad (27)$$
Choosing the concrete functions $\sqrt{x}$, $\sin x$, $\ln x$ and so forth as
$f(x)$ we derive the approximate formulas
$$\sqrt{a + h} \approx \sqrt{a} + \frac{h}{2\sqrt{a}}, \qquad \sin(a + h) \approx \sin a + h \cos a, \qquad \ln(a + h) \approx \ln a + \frac{h}{a} \qquad (28)$$
etc. which are applicable for small $|h|$. The formulas can be made
more precise and their errors can be effectively estimated. This question
will be discussed in Secs. 15-16.

Let us consider an example. Suppose we know that $\ln 2 = 0.693$.
Then calculating with an accuracy of 0.001 we get
$$\ln 2.1 = \ln(2 + 0.1) \approx \ln 2 + \frac{0.1}{2} = 0.693 + 0.050 = 0.743$$
The table of logarithms gives the value $\ln 2.1 = 0.742$, i.e. the
error is smaller than 0.2 per cent.
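The estimate of $\ln 2.1$ by formula (27) is easy to reproduce. In this sketch the function name `linear_approx` is an assumption; the rounded value $\ln 2 = 0.693$ is the one given in the text, and `math.log` plays the role of the table of logarithms.

```python
import math


def linear_approx(f_a, df_a, h):
    """Formula (27): f(a + h) is approximately f(a) + f'(a) * h."""
    return f_a + df_a * h


# f(x) = ln x, a = 2, h = 0.1, so f'(a) = 1/2
approx = linear_approx(0.693, 1 / 2, 0.1)
exact = math.log(2.1)
relative_error = abs(approx - exact) / exact

print(approx, exact, relative_error)
```

The relative error stays below the 0.2 per cent quoted above.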

It is sometimes necessary to transform the expression that must
be calculated in order to facilitate the calculations. For instance,
it is wrong to calculate $\sqrt[3]{2}$ as
$$\sqrt[3]{2} = \sqrt[3]{1 + 1} \approx \sqrt[3]{1} + \frac{1}{3\sqrt[3]{1^2}} = 1 + \frac{1}{3} = 1.333$$
since the value $h = 1$ can hardly be regarded as small in comparison
with $a = 1$. It is convenient to put here $\sqrt[3]{2} = \frac{1}{m}\sqrt[3]{2m^3}$ and to choose
the integer $m$ so that $2m^3$ should become as close as possible to an
exact cube of an integer. It is possible to take $m = 4$ since $2 \cdot 4^3 = 128$
is close to $125 = 5^3$; then we get
$$\sqrt[3]{2} = \frac{1}{4}\sqrt[3]{128} = \frac{1}{4}\sqrt[3]{125 + 3} \approx \frac{1}{4}\left(\sqrt[3]{125} + \frac{3}{3\sqrt[3]{125^2}}\right) = \frac{1}{4}\left(5 + \frac{1}{25}\right) = \frac{1}{4} \times 5.0400 = 1.2600$$
Tables of roots yield the value $\sqrt[3]{2} = 1.2599$, i.e. the error is smaller
than 0.01 per cent.
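The effect of rescaling before linearizing can be seen by computing both estimates of $\sqrt[3]{2}$. The sketch assumes a helper `cbrt_linear` (a hypothetical name) that applies $\sqrt[3]{a + h} \approx \sqrt[3]{a} + h / (3\sqrt[3]{a^2})$ for a perfect cube $a$.

```python
def cbrt_linear(a, h):
    """Linear estimate of the cube root of a + h, where a is a perfect cube."""
    root = round(a ** (1 / 3))
    assert root**3 == a, "a must be an exact cube of an integer"
    return root + h / (3 * root**2)


naive = cbrt_linear(1, 1)            # a = 1, h = 1: h is not small
scaled = cbrt_linear(125, 3) / 4     # cbrt(2) = (1/4) * cbrt(128), a = 125, h = 3
exact = 2 ** (1 / 3)

print(naive, abs(naive - exact) / exact)
print(scaled, abs(scaled - exact) / exact)
```

The rescaled estimate is far more accurate because $h/a = 3/125$ is small, whereas in the naive attempt $h/a = 1$.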


Differentials are also used for estimating errors. Suppose, the variables
$x$ and $y$ are connected by a functional relation $y = f(x)$, and
let the approximate value $\bar{x}$ of the argument $x$ be known with the
maximum absolute error $\alpha_x$ (see Sec. I.7). Then, of course, $\bar{y} = f(\bar{x})$
should be taken as an approximate value of $y$. To estimate the
maximum absolute error $\alpha_y$ we observe that $x = \bar{x} + h$ where
$|h| \le \alpha_x$, and therefore, if $\alpha_x$ (and, consequently, $h$) is small, then
$$y = \bar{y} + \Delta y \approx \bar{y} + dy = \bar{y} + f'(\bar{x})\,h$$
that is
$$|y - \bar{y}| \approx |f'(\bar{x})|\,|h| \le |f'(\bar{x})|\,\alpha_x$$
Thus, we can put
$$\alpha_y = |f'(\bar{x})|\,\alpha_x \qquad (29)$$
For example, let $y = x^n$. Then
$$\alpha_y = |n\bar{x}^{\,n-1}|\,\alpha_x$$
and the corresponding maximum relative errors are connected by
the simple formula
$$\delta_y = \frac{\alpha_y}{|\bar{y}|} = \frac{|n\bar{x}^{\,n-1}|\,\alpha_x}{|\bar{x}|^n} = |n|\,\frac{\alpha_x}{|\bar{x}|} = |n|\,\delta_x$$
As another example let us consider $\ln 10.7$ where the
value 10.7 is approximate and is known with an accuracy of 0.1.
We have $\ln 10.7 = 2.3702$ according to the table but it is obvious
that this result contains too many decimal digits. To understand
what the accuracy of the result is we must take into account that
in our case $\alpha_x = 0.1$ which implies
$$\alpha_y = \frac{1}{10.7} \cdot 0.1 \approx 0.01$$
that is the result should be put down in the form $\ln 10.7 = 2.37$.
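Formula (29) can be tried out on the example $\ln 10.7$ with the argument known to within 0.1. The following sketch uses a hypothetical helper `propagated_error` and compares the linear estimate with the actual change of the logarithm at the ends of the interval.

```python
import math


def propagated_error(df, x_approx, err_x):
    """Formula (29): maximum absolute error of y = f(x) is about |f'(x)| * err_x."""
    return abs(df(x_approx)) * err_x


x_approx, err_x = 10.7, 0.1
err_y = propagated_error(lambda x: 1 / x, x_approx, err_x)   # (ln x)' = 1/x

# Actual deviations of ln x at the two ends of the interval [10.6, 10.8]
actual = max(math.log(x_approx + err_x) - math.log(x_approx),
             math.log(x_approx) - math.log(x_approx - err_x))

print(err_y, actual, round(math.log(x_approx), 2))
```

The estimate is slightly below 0.01, so only two decimal digits of $\ln 10.7$ are meaningful.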


§ 3. Derivatives and Differentials of Higher Orders 


11. Derivatives of Higher Orders. Let $y = f(x)$. Then the derivative
$y' = f'(x)$ which was studied in § 1 is called the derivative
of the first order or the first derivative of the function $f(x)$. In its
turn, $f'(x)$ is also a function of $x$ and therefore it is possible to take
its derivative which is called the derivative of the second order
or the second derivative of the original function:
$$y'' = (y')' = f''(x)$$
In the same way we define the derivative of the third order (the
third derivative):
$$y''' = (y'')' = f'''(x)$$
The subsequent derivatives are denoted as $y^{(4)} = y^{IV}$, $y^{(5)} = y^{V}$
etc. For example, $(x^3)' = 3x^2$, $(x^3)'' = (3x^2)' = 6x$; $(\sin x)' = \cos x$,
$(\sin x)'' = (\cos x)' = -\sin x$ and the like. The derivative of the
second order sometimes has a clear physical meaning: thus, in the
first example of Sec. 1 the derivative of the second order of the distance
with respect to the time is the velocity of change of the instantaneous
velocity, that is the instantaneous acceleration. We
shall discuss the applications of the derivatives of higher orders
in Sec. 15 and further.

The formula for the derivative of a sum is quite simple. If $y = u + v$
then $y' = u' + v'$, $y'' = (u' + v')' = u'' + v''$ and so
on. Generally,
$$(u + v)^{(n)} = u^{(n)} + v^{(n)}$$
As for the formula for the derivative of a product, we have
$$(uv)' = u'v + uv',$$
$$(uv)'' = (u'v + uv')' = u''v + u'v' + u'v' + uv'' = u''v + 2u'v' + uv'',$$
$$(uv)''' = (u''v + 2u'v' + uv'')' = u'''v + 2u''v' + u'v'' + u''v' + 2u'v'' + uv''' = u'''v + 3u''v' + 3u'v'' + uv''' \quad \text{etc.} \qquad (30)$$
Here computing the next derivative we begin with differentiating
the first factors in all the terms and after that we differentiate the
second factors in all the terms. These computations are carried out
according to the scheme resembling that of expanding the expressions
$(a + b)^2$, $(a + b)^3$ and so forth:
$$(a + b)^2 = (a + b)(a + b) = a^2 + ab + ab + b^2 = a^2 + 2ab + b^2,$$
$$(a + b)^3 = (a^2 + 2ab + b^2)(a + b) = a^3 + 2a^2b + ab^2 + a^2b + 2ab^2 + b^3 = a^3 + 3a^2b + 3ab^2 + b^3 \quad \text{etc.} \qquad (31)$$
Therefore the coefficients in formulas (30) are the same as in formulas
(31). In the general case the formula (the Leibniz rule) can be
written as
$$(uv)^{(n)} = u^{(n)}v + \binom{n}{1} u^{(n-1)}v' + \binom{n}{2} u^{(n-2)}v'' + \ldots + uv^{(n)} \qquad (32)$$
where $\binom{n}{1}, \binom{n}{2}, \ldots$ are the so-called binomial coefficients, i.e.
the coefficients occurring in the binomial formula [the expansion
of the power $(a + b)^n$].
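The Leibniz rule (32) can be verified exactly on polynomials, where derivatives are computed by shifting coefficients. The sketch below stores a polynomial as a list of coefficients in increasing powers; all helper names and the two sample polynomials are assumptions chosen for illustration.

```python
from math import comb


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_diff(p, times=1):
    """Derivative of order `times` of a polynomial."""
    for _ in range(times):
        p = [k * c for k, c in enumerate(p)][1:] or [0]
    return p


def poly_add(p, q):
    """Sum of two polynomials of possibly different lengths."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p


def leibniz(u, v, n):
    """Right-hand side of (32): sum of C(n, k) * u^(n-k) * v^(k) for k = 0..n."""
    total = [0]
    for k in range(n + 1):
        term = poly_mul(poly_diff(u, n - k), poly_diff(v, k))
        total = poly_add(total, [comb(n, k) * c for c in term])
    return trim(total)


u = [1, 2, 0, 3]       # 1 + 2x + 3x^3
v = [4, 0, -1, 0, 2]   # 4 - x^2 + 2x^4
direct = trim(poly_diff(poly_mul(u, v), 3))

print(direct == leibniz(u, v, 3))
```

Because the coefficients are integers, the comparison is exact and does not depend on rounding.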

The calculation of the derivative of an implicit function will be
illustrated by taking equation (12). Further differentiation of equality
(13) yields
$$\frac{2}{a^2} + \frac{2}{b^2}\,(y'^2 + yy'') = 0, \quad \text{i.e.} \quad y'' = -\frac{1}{y}\left(\frac{b^2}{a^2} + y'^2\right) = -\frac{1}{y}\left(\frac{b^2}{a^2} + \frac{b^4 x^2}{a^4 y^2}\right) = -\frac{b^4}{a^2 y^3}\left(\frac{y^2}{b^2} + \frac{x^2}{a^2}\right) = -\frac{b^4}{a^2 y^3} \qquad (33)$$
[here we have used expression (14) for $y'$]. The following derivatives
are computed in a similar way. Formula (33) can also be obtained
by differentiating equality (14).

In addition, we shall turn to differentiating a function represented
by its parametric equations. Differentiating formula (25) we get
$$y''_{xx} = \frac{d(y'_x)}{dx} = \frac{d\left(\frac{\dot{y}}{\dot{x}}\right)}{\dot{x}\,dt} = \frac{\ddot{y}\,\dot{x} - \dot{y}\,\ddot{x}}{\dot{x}^3}$$
(the two dots denote the second derivative with respect to a parameter).
Here, as in deducing formula (25), we need not pay attention
to which of the variables is independent when we calculate
the differential (the first differential, as we shall call it in Sec. 12).
The derivatives of higher order can be found similarly.

12. Higher-Order Differentials. Let $y = f(x)$. Then
$$dy = f'(x)\,dx \qquad (34)$$
The differential calculated by formula (34) is called the first differential
(the first-order differential). The differential of the second
order (the second-order differential) of a function is the first differential
of the first differential of the function. It is denoted by $d^2y$.
If $x$ is an independent variable then in the process of the repeated
differentiation the quantity $dx$ is regarded as being independent
of $x$. Thus, $dx$ being a constant, we take it outside the differentiation
sign:
$$d^2y = d(dy) = d(f'(x)\,dx) = d(f'(x))\,dx = (f'(x))'\,dx\,dx = f''(x)\,dx^2 \qquad (35)$$
where the notation $dx^2 = (dx)^2$ is assumed. Similarly, we obtain
$$d^3y = d(d^2y) = f'''(x)\,dx^3 \qquad (36)$$
and so on. This enables us to write the derivatives of higher order
as the ratios of the corresponding differentials:
$$y'' = \frac{d^2y}{dx^2}, \qquad y''' = \frac{d^3y}{dx^3} \quad \text{etc.} \qquad (37)$$
In addition, we see that if $dy$ is an infinitesimal of the first order
relative to $dx$ then $d^2y$ is an infinitesimal of the second order, $d^3y$
is an infinitesimal of the third order and so on. Further, we note that
$$d^2x = d(dx) = d(1 \cdot dx) = dx\,d(1) = 0$$
that is the second-order differential of an independent variable is equal
to zero; of course, all the following differentials of an independent
variable are also equal to zero.

In case $x$ is not an independent variable (or we do not know whether
it is independent or not) formula (34), as it was seen in Sec. 8, is
nevertheless true. But using this formula for subsequent differentiations
we must not consider $dx$ constant but should use the rule of
differentiating a product [see formula (24)]:
$$d^2y = d(f'(x)\,dx) = d(f'(x)) \cdot dx + f'(x)\,d(dx) = f''(x)\,dx^2 + f'(x)\,d^2x \qquad (38)$$
In a similar way we find
$$d^3y = f'''(x)\,dx^3 + 3f''(x)\,dx\,d^2x + f'(x)\,d^3x \qquad (39)$$
(check it!) and all the following differentials. If now it turns out
that $x$ is an independent variable then $d^2x = d^3x = 0$, and formula
(38) turns into formula (35) and formula (39) turns into formula
(36). Thus, formulas (35)-(37) should be used only in case $x$ is an independent
variable.

The notion of the derivative and that of the differential and all
the basic rules of operating on them were elaborated by Newton
(1666) and by Leibniz (1684) although these notions had been used
for some particular problems before.
Differential calculus has very many applications in the field of
investigating the behaviour of functions. These applications will
be considered in the forthcoming sections of our course.


§ 4. L'Hospital’s Rule


13. Indeterminate Forms of the Type $\frac{0}{0}$. We said in Sec. II.7 that
the evaluation of the limit of the ratio of two infinitesimals might
yield different results in different cases. J. Bernoulli discovered a
simple rule for evaluating such a limit which is applicable to many
cases. This rule was published in 1696 in the first printed textbook
on differential calculus written by L’Hospital, a French mathematician
(1661-1704). Let it be necessary to evaluate the limit
$$\lim_{t \to t_0} \frac{\varphi(t)}{\psi(t)} \qquad (40)$$
where
$$\varphi(t_0) = \psi(t_0) = 0 \qquad (41)$$
that is let us have an indeterminate form of the type $\frac{0}{0}$. Suppose
that the limit (finite or infinite)
$$\lim_{t \to t_0} \frac{\varphi'(t)}{\psi'(t)} = k \qquad (42)$$
is found in some way. Then we assert that the limit (40) is also equal
to $k$, that is we have
$$\lim \frac{\varphi(t)}{\psi(t)} = \lim \frac{\varphi'(t)}{\psi'(t)} \qquad (43)$$
for indeterminate forms of the type $\frac{0}{0}$.


In order to prove the rule let us consider the curve $y = \varphi(t)$,
$x = \psi(t)$ in the x, y-plane. Then, as $t \to t_0$, by (41), this curve
approaches the origin of the coordinate system. To find out in what
way it approaches the origin (i.e. like a spiral or along a certain
direction which then should be specified etc.) we note that, by (42),
we have

$$\frac{dy}{dx} = \frac{\varphi'(t)\,dt}{\psi'(t)\,dt} = \frac{\varphi'(t)}{\psi'(t)} \to k \quad (\text{as } t \to t_0)$$

Hence (see Fig. 119), while the curve approaches the origin the
tangent to the curve turns in this process and tends to the limiting
position in which it forms an angle $\alpha$ with the x-axis such that
$\tan\alpha = k$. But then the angle $\beta$ (“the angle of elevation”) also
tends to $\alpha$, i.e.

$$\frac{\varphi(t)}{\psi(t)} = \frac{y}{x} = \tan\beta \to \tan\alpha = k \qquad (44)$$

which is just what we set out to prove.
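It sometimes happens that using L’Hospital’s rule we obtain
a ratio of derivatives which is again an indeterminate form of the
type 0/0; then it is possible to apply the rule again and so forth.
For example,

$$\lim_{x \to 0} \frac{x - \sin x}{x^3} = \left(\frac{0}{0}\right) = \lim_{x \to 0} \frac{1 - \cos x}{3x^2} = \left(\frac{0}{0}\right) = \lim_{x \to 0} \frac{\sin x}{6x} = \left(\frac{0}{0}\right) = \lim_{x \to 0} \frac{\cos x}{6} = \frac{1}{6},$$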
. 2¥—4X2-* 0 smn 2*In2-+4X2-* In 2 áln2 
im iad) d+ IME | al a 


We have applied L’Hospital’s rule three times in the first example 
and once in the second example. 
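
The first limit above can also be checked numerically. The following
Python sketch is only an illustration (the function name `ratio` is ours
and does not come from the text): it evaluates the ratio for several
small values of x, and the printed values should approach 1/6.

```python
import math

def ratio(x):
    # (x - sin x) / x**3 is an indeterminate form 0/0 at x = 0
    return (x - math.sin(x)) / x**3

# The closer x is to zero, the closer the ratio should be to 1/6
for x in (0.5, 0.1, 0.01):
    print(x, ratio(x))
```

Very small values of x (say 1e-8) should be avoided here, because the
subtraction x - sin x then loses almost all significant digits in
floating-point arithmetic.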

L’Hospital’s rule for indeterminate forms of type (40) always
achieves the aim in case $t_0$ is a finite number and the numerator
and the denominator are infinitesimals of an integral order of smallness
relative to $t - t_0$ (see Sec. III.10). Indeed, L’Hospital’s rule
implies that each differentiation reduces the order of the infinitesimals
by unity, and after several steps we shall obtain variables
of the “zero order” (that is having a finite limit) in the numerator
or in the denominator (or in both of them) and thus we shall no
longer have an indeterminate form.

Fig. 119    Fig. 120

14. Indeterminate Forms of the Type ∞/∞. L’Hospital’s rule (43)
also remains true for indeterminate forms of the type ∞/∞, i.e. when
the condition

$$|\varphi(t_0)| = |\psi(t_0)| = \infty$$

is put instead of (41).

The proof for this case is analogous to the one given in Sec. 13
but now the curve $y = \varphi(t)$, $x = \psi(t)$ does not approach the origin
of the coordinate system as $t \to t_0$ but travels to infinity (see Fig. 120).
In this process the curve, by condition (42), turns in such a way
that the angle $\alpha$ which the curve (i.e. its tangent) forms with the
x-axis tends to $\alpha_0$ where $\tan\alpha_0 = k$. But then the distance passed
by the point M of the curve along the straight line ll (see Fig. 120)
will be an infinitely large variable of higher order than the distance
along the straight line transversal to ll, namely, $MM' \ll OM$.
Hence, $\angle MOM' \to 0$ as the point M travels to infinity and therefore
the “angle of elevation” $\beta$ tends to $\alpha_0$ and we can write the same
formula (44) again which concludes the proof.

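We give here several important examples:

$$\lim_{x \to \infty} \frac{\log_a x}{x^k} = \left(\frac{\infty}{\infty}\right) = \lim_{x \to \infty} \frac{1/(x \ln a)}{k x^{k-1}} = \lim_{x \to \infty} \frac{1}{k \ln a \cdot x^k} = 0 \quad (a > 1,\ k > 0),$$

$$\lim_{x \to \infty} \frac{x^k}{b^x} = \lim_{x \to \infty} \left(\frac{x}{a^x}\right)^k = \left(\lim_{x \to \infty} \frac{x}{a^x}\right)^k = \left(\frac{\infty}{\infty}\right) = \left(\lim_{x \to \infty} \frac{1}{a^x \ln a}\right)^k = 0^k = 0$$

$$(k > 0,\ b > 1,\ a = b^{1/k})$$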

Hence, a logarithmic function with a base greater than unity 
tends to infinity (as its argument tends to infinity) slower than any 
power function with a positive exponent. Besides, a power function 
tends to infinity slower than any exponential function with a base 
greater than unity. 

L’Hospital’s rule can be applied to some indeterminate forms
of other types (see Sec. III.5, property 6 and the beginning of
Sec. III.15) after they are transformed into forms of the type 0/0 or
∞/∞. This can be achieved according to the following scheme:
ea E BA O00, 

These formulas should be understood in a conditional sense. We
use them only to indicate the types of the variables. After taking
logarithms we can also apply L’Hospital’s rule to power indeterminate
forms.




§ 5. Taylor's Formula and Series


15. Taylor’s Formula. It was shown in Sec. 10 that replacing 
the increment of a function by its differential we can deduce many 
approximate formulas. It turns out that these formulas can be made 
much more accurate if we apply differentials of higher order. This 
problem is solved by means of Taylor's formula named after 
B. Taylor (1685-1731), an English mathematician. 

Let us first suppose that we are given a polynomial P (x). A poly- 
nomial is usually considered as expanded into powers of x but it 
is quite easy to expand it in powers of x — a where a is an arbitrary 
number. 

Suppose, for example, that we are going to expand the polynomial
$P(x) = 5 - 3x + 2x^3$ in powers of $x - 4$. In order to do this it
is sufficient to substitute $x = [4 + (x - 4)]$ and then remove the
square brackets without removing the parentheses:

$$\begin{aligned} P(x) &= 5 - 3[4 + (x - 4)] + 2[4 + (x - 4)]^3 = \\ &= 5 - 12 - 3(x - 4) + 128 + 96(x - 4) + 24(x - 4)^2 + 2(x - 4)^3 = \\ &= 121 + 93(x - 4) + 24(x - 4)^2 + 2(x - 4)^3 \end{aligned}$$
In the general case, for a polynomial of degree n, we can write

$$P(x) = a_0 + a_1(x - a) + a_2(x - a)^2 + a_3(x - a)^3 + \dots + a_n(x - a)^n \qquad (45)$$

The coefficients here can be found in the following way. First we
put x = a and obtain $P(a) = a_0$. Then we differentiate formula (45):

$$P'(x) = a_1 + 2a_2(x - a) + 3a_3(x - a)^2 + \dots + n a_n(x - a)^{n-1}$$

If now we put here x = a this will yield $P'(a) = a_1$. Let us differentiate
once again:

$$P''(x) = 1 \cdot 2\,a_2 + 2 \cdot 3\,a_3(x - a) + \dots + (n - 1)n\,a_n(x - a)^{n-2}$$

which implies $P''(a) = 1 \cdot 2\,a_2$. Further, in a similar way we
derive $P'''(a) = 1 \cdot 2 \cdot 3\,a_3$ and so on. Generally, $P^{(k)}(a) = k!\,a_k$
(where $k! = 1 \cdot 2 \cdots k$) which yields

$$a_k = \frac{P^{(k)}(a)}{k!}$$
Thus, formula (45) can be rewritten in the following form:

$$P(x) = P(a) + \frac{P'(a)}{1!}(x - a) + \frac{P''(a)}{2!}(x - a)^2 + \dots + \frac{P^{(n)}(a)}{n!}(x - a)^n = P(a) + \sum_{k=1}^{n} \frac{P^{(k)}(a)}{k!}(x - a)^k \qquad (46)$$

where $\sum$ is the summation sign (see Sec. II.6).

For example, taking the polynomial from the previous example
we deduce

$$P'(x) = -3 + 6x^2, \quad P''(x) = 12x, \quad P'''(x) = 12,$$

$$P(4) = 5 - 3 \cdot 4 + 2 \cdot 4^3 = 121, \quad \frac{P'(4)}{1!} = -3 + 6 \cdot 4^2 = 93, \quad \frac{P''(4)}{2!} = \frac{12 \cdot 4}{2} = 24, \quad \frac{P'''(4)}{3!} = \frac{12}{6} = 2,$$

that is we obtain the same values of the coefficients as before.
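
The rule $a_k = P^{(k)}(a)/k!$ is easy to try out on a computer. The
Python sketch below is an illustration only; the list representation of
a polynomial and the helper names `derivative`, `evaluate` and
`taylor_coefficients` are our assumptions, not notation from the text.
A polynomial is stored as the list of its coefficients in increasing
powers of x.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    # coeffs[i] multiplies x**i; differentiate term by term
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    # Horner's scheme for the value of the polynomial at x
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def taylor_coefficients(coeffs, a):
    # a_k = P^(k)(a) / k!, computed by repeated differentiation
    out, current, k = [], list(coeffs), 0
    while current:
        out.append(Fraction(evaluate(current, a), factorial(k)))
        current = derivative(current)
        k += 1
    return out

# P(x) = 5 - 3x + 2x^3 expanded in powers of (x - 4)
print(taylor_coefficients([5, -3, 0, 2], 4))
```

For the polynomial of the example the list returned contains the four
coefficients 121, 93, 24 and 2 found above.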

If now we take an arbitrary function f(x) in place of a polynomial
P(x) then formula (46) will no longer hold. But if we denote the
difference between the left-hand and the right-hand sides of formula (46)
by $R_n(x)$ (the remainder) then we can write

$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n + R_n(x) \qquad (47)$$


It is this formula that is called Taylor’s formula. The most essential
thing about the formula is that the remainder is an infinitesimal
of at least (n + 1)th order relative to x - a as $x \to a$, that is $R_n(x)$
is an infinitesimal of higher order than the last of the “exact” terms
put down in formula (47). In order to prove this assertion let us
suppose, for simplicity’s sake, that n = 2, that is

$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + R_2(x)$$

Finding $R_2(x)$ from this and applying L’Hospital’s rule (see
Sec. 13) we obtain

$$\begin{aligned} \lim_{x \to a} \frac{R_2(x)}{(x - a)^3} &= \lim_{x \to a} \frac{f(x) - f(a) - f'(a)(x - a) - \frac{f''(a)}{2}(x - a)^2}{(x - a)^3} = \left(\frac{0}{0}\right) = \\ &= \lim_{x \to a} \frac{f'(x) - f'(a) - f''(a)(x - a)}{3(x - a)^2} = \left(\frac{0}{0}\right) = \\ &= \lim_{x \to a} \frac{f''(x) - f''(a)}{3 \cdot 2\,(x - a)} = \left(\frac{0}{0}\right) = \lim_{x \to a} \frac{f'''(x)}{3!} = \frac{f'''(a)}{3!} \end{aligned}$$

i.e. the ratio has a finite limit as $x \to a$. This just implies
(see Sec. III.10) our assertion about the order of smallness of $R_2(x)$,
and an analogous consideration holds for $R_n(x)$.

Let us denote x = a + h. We see that dropping the remainder
in formula (47) for successively increasing values of n we shall
obtain approximate formulas with, respectively, increasing degrees
of accuracy (for small values of |h|):

$$f(a + h) \approx f(a) + f'(a)h \qquad (49)$$

[this formula coincides with formula (27) and guarantees an accuracy
to a term of the order of $h^2$],

$$f(a + h) \approx f(a) + f'(a)h + \frac{f''(a)}{2!}h^2 \qquad (50)$$

with an accuracy of the order of $h^3$,

$$f(a + h) \approx f(a) + f'(a)h + \frac{f''(a)}{2!}h^2 + \frac{f'''(a)}{3!}h^3 \qquad (51)$$

with an accuracy of the order of $h^4$ and so forth.

The polynomials (in h = x - a) entering into the right-hand
sides are called Taylor’s polynomials. They yield, in some sense,
the best approximate expression of a function f(x) in the form of
a polynomial of a given degree near the value x = a. Namely,
a Taylor polynomial differs from f(x) by a term which is an infinitesimal
of the highest order in comparison with all the polynomials
of the same degree as $x \to a$. For instance, even if we change only
one of the coefficients on the right-hand side of (50) then the difference
may become an infinitesimal of order 0, 1 or 2 but not an
infinitesimal of the third order relative to x - a (as $x \to a$).
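
The orders of accuracy stated for formulas (49)-(51) can be observed
numerically. In the Python sketch below we take, purely as an
illustrative assumption, $f(x) = e^x$ and a = 0, so that all the
derivatives at a are equal to 1; the names `taylor_exp` and `error` are
ours. If the error of the polynomial of degree n is of the order of
$h^{n+1}$, then halving h should reduce the error roughly $2^{n+1}$ times.

```python
import math

def taylor_exp(n, h):
    # Taylor polynomial of degree n for e**h about a = 0
    return sum(h**k / math.factorial(k) for k in range(n + 1))

def error(n, h):
    # absolute error of the degree-n approximation at the point h
    return abs(math.exp(h) - taylor_exp(n, h))

# Ratio of the errors for h and h/2; expected to be close to 4, 8 and 16
ratios = [error(n, 0.1) / error(n, 0.05) for n in (1, 2, 3)]
print(ratios)
```

The ratios are only approximately equal to 4, 8 and 16 since the
neglected terms of still higher order are small but not zero.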

16. Taylor’s Series. Since the errors of formulas (49), (50), (51)
etc. are becoming infinitesimals of greater and still greater order
(as $n \to \infty$) it is quite natural to expect that for small |h| it is
permissible to pass to the limit and thus obtain an “exact” representation
of f(a + h) in the form of a sum of an infinite series (see
Sec. III.6), that is

$$f(a + h) = f(a) + \frac{f'(a)}{1!}h + \frac{f''(a)}{2!}h^2 + \dots + \frac{f^{(n)}(a)}{n!}h^n + \dots = f(a) + \sum_{n=1}^{\infty} \frac{f^{(n)}(a)}{n!}h^n \qquad (52)$$

This series is called Taylor’s series; it was originally introduced by
B. Taylor in 1715. Such series will be systematically treated in
§ 3 of Chapter XVII. It will be shown there that the above-mentioned
supposition is true. By the way, we shall answer the question what
particular values of h guarantee the possibility of using formula (52).
It turns out that it is always permissible to apply the formula if
the series is practically convergent in the sense described in the
end of Sec. III.6 [but in this case the function which should be
expanded into Taylor’s series must not be represented by different
formulas on different parts of the range of its argument (see Sec. I.13)].
Taking this comment into account we shall now use Taylor’s series.


Formula (52) can be rewritten (by substituting a + h = x and
h = x - a) in the form

$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \dots \qquad (53)$$

which is an expansion into powers of x - a. In particular, when
a = 0 we obtain an expansion in powers of x:

$$f(x) = f(0) + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + \dots \qquad (54)$$

Series (54) is sometimes called Maclaurin’s series after C. Maclaurin
(1698-1746), a Scottish mathematician, which is historically incorrect.
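For example, let $f(x) = e^x$. Then $f'(x) = e^x$, $f''(x) = e^x$, ...
and

$$f(0) = 1, \quad f'(0) = 1, \quad f''(0) = 1, \quad \dots$$

Hence formula (54) results here in

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots \qquad (55)$$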


Let us evaluate the number e with an accuracy of 0.001. In order
to do this let us put x = 1 and evaluate the terms one after another
retaining a reserve decimal digit:

e = 1.0000 + 1.0000 + 0.5000 + 0.1667 + 0.0417 +
  + 0.0083 + 0.0014 + 0.0002 + 0.0000

Every subsequent term here is obtained by dividing the foregoing
term by the next integer. As it is seen, the terms of the series manifest
an obvious tendency to decrease fast and, in addition, they
soon become less than the required degree of accuracy. Now summing
up and rounding the result we obtain e = 2.718.
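
The computation just described can be repeated by a short program. The
Python sketch below is an illustration under our own assumptions: the
name `approximate_e` and the stopping rule (stop as soon as a term drops
below the tolerance) are not taken from the text, and the rule is a
practical one justified here only by the fast decrease of the terms.

```python
def approximate_e(tolerance=1e-4):
    # Sum the series 1 + 1/1! + 1/2! + ... for e
    total, term, n = 0.0, 1.0, 0
    while term >= tolerance:
        total += term
        n += 1
        term /= n  # every term is the previous one divided by the next integer
    return total

print(round(approximate_e(), 3))
```

With the default tolerance the sum stops after the term 1/7!, which is
the last term of the hand computation above that is not rounded to
zero.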

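In a way similar to that of deducing (55) it is possible to obtain
the following formulas (we leave this to the reader):

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots \qquad (56)$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots \qquad (57)$$

$$\cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \frac{x^6}{6!} + \dots \qquad (58)$$

$$\sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \dots \qquad (59)$$

$$(1 + x)^p = 1 + \binom{p}{1}x + \binom{p}{2}x^2 + \binom{p}{3}x^3 + \dots \qquad (60)$$

(for any p) where the binomial coefficients $\binom{p}{k}$ are defined
by the formulas $\binom{p}{1} = p$, $\binom{p}{2} = \frac{p(p-1)}{1 \cdot 2}$,
$\binom{p}{3} = \frac{p(p-1)(p-2)}{1 \cdot 2 \cdot 3}$ etc. If p is a positive integer,
$\binom{p}{p+1} = \binom{p}{p+2} = \dots = 0$ and series (60) turns into a finite sum.
We thus obtain the expression for the binomial coefficients.
For the logarithmic function we similarly derive

$$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots \qquad (61)$$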


J. L. Lagrange (1736-1813), a prominent French mathematician,
proved that the remainder $R_n(x)$ in formula (47) admits the representation

$$R_n(x) = \frac{f^{(n+1)}(\xi)}{(n + 1)!}(x - a)^{n+1}$$

where $\xi$ is a certain value lying between a and x. This representation
sometimes makes it possible to find out for what values of x
formula (53) holds because the formula is true if and only if
$R_n(x) \to 0$ as $n \to \infty$. For instance, if we take into account that
$\frac{h^n}{n!} \to 0$ as $n \to \infty$ for any h it is easy to prove that
formulas (55)-(59) are true for any x (think it over!).

Taylor’s series can be rewritten in another form if we denote
$x - a = \Delta x$; $f(x) - f(a) = \Delta f$; $f'(a)(x - a) = f'(a)\,\Delta x = df$;
$f''(a)(x - a)^2 = f''(a)(\Delta x)^2 = d^2f$ (see Sec. 12) etc. Then we deduce
from (53),

$$\Delta f = \frac{df}{1!} + \frac{d^2f}{2!} + \dots + \frac{d^nf}{n!} + \dots \qquad (62)$$

Truncating this formula we obtain (for small $\Delta x$) approximate formulas
(more and still more accurate): $\Delta f \approx df$ [accurate to a term
of the order of $(\Delta x)^2$], $\Delta f \approx df + \frac{d^2f}{2}$ [accurate to a term of the
order of $(\Delta x)^3$] and so on.


§ 6. Intervals of Monotonicity. Extremum 


17. Sign of Derivative. Let us consider a function y = f(x).
We assume throughout this section that the function and its derivative
have no discontinuities. A sketch of the graph of the function
is shown in Fig. 121. Since $y' = \tan\alpha$ (see Sec. 3) the function increases
on every interval where its derivative is positive and decreases
on every interval where the derivative is negative. In other
words, if the rate of a variable is positive the variable increases
and if the rate is negative the variable decreases.

Since the derivative should pass through the zero value in a continuous
transition from its positive values to its negative values
there must be y' = 0 at those points where an interval of increase
borders on an interval of decrease (of course, the same takes place
when there is a passage from negative values to positive values of
the derivative). A point x at which f'(x) = 0 is called a critical
point of the function y = f(x); the instantaneous rate of change
of the function is equal to zero at such a point, that is critical points
serve as if they were “points of instantaneous state of rest”. There are
three critical points in Fig. 121: a, b and c. The corresponding values
of the function are called its critical (or stationary) values.

Fig. 121

From what has been said it follows that to determine the intervals
of monotonicity of f(x) it is necessary to indicate all the critical
points of the function on the x-axis and then to investigate the sign
of f' on each interval lying between neighbouring critical points.
The intervals where f' > 0 will be the intervals of increase of f
and the intervals on which f' < 0 will be the intervals of decrease
of f. In case the sign of f' is the same in two neighbouring intervals
these intervals form an entire interval of monotonicity; thus, intervals
III and IV in Fig. 121 constitute a whole interval of increase
of the function f(x).

It is also obvious that the function f is constant on an interval if
and only if f'(x) = 0 on the interval. Indeed, the function can neither
increase nor decrease on its interval of constancy.
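
The procedure for finding the intervals of monotonicity can be sketched
in Python as follows. This is only an illustration under stated
assumptions: the derivative is supplied as a function, the critical
points are already known (at least one of them is required), and the
sign of the derivative is tested at one sample point inside each
interval. The name `monotonicity_intervals` and the sample function
$f(x) = x^3 - 3x$ are ours.

```python
def monotonicity_intervals(df, critical_points):
    # Split the x-axis at the critical points and test the sign of f'
    # at one sample point inside every interval
    pts = sorted(critical_points)
    samples = ([pts[0] - 1.0]
               + [(p + q) / 2 for p, q in zip(pts, pts[1:])]
               + [pts[-1] + 1.0])
    bounds = [float("-inf")] + pts + [float("inf")]
    result = []
    for left, right, s in zip(bounds, bounds[1:], samples):
        kind = "increase" if df(s) > 0 else "decrease"
        result.append((left, right, kind))
    return result

# f(x) = x**3 - 3x has f'(x) = 3x**2 - 3 with critical points -1 and 1
for interval in monotonicity_intervals(lambda x: 3 * x**2 - 3, [-1, 1]):
    print(interval)
```

The sketch does not merge neighbouring intervals on which the
derivative has the same sign; as noted above, such intervals form a
single interval of monotonicity.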

18. Points of Extremum. If the value $f(x_0)$ at some point $x = x_0$
is greater than all the “neighbouring” values of the function (i.e. greater
than the values of f(x) taken for x lying sufficiently close to $x_0$)
then the point $x_0$ is called the point of maximum of the function f(x)
(or we say that f(x) has a maximum at the point $x = x_0$), and $f(x_0)$
is called its maximal value. The point of minimum and the minimal
value of a function are defined in a similar way. Thus, the function
shown in Fig. 121 has a point of maximum at x = a and a point
of minimum at x = b. In other cases a function can have any other
number of points of maximum or minimum and if a function is
continuous its maxima and minima must alternate. For example,
the function in Fig. 122 has three points of maximum and two points
of minimum; there is an infinitude of maxima and minima in Fig. 46
whereas there are no such points at all in Fig. 4A.

Fig. 122

A maximum or a minimum of a function is called an extremum
which means “an extreme value” of the function. From Sec. 17
it follows that points of extremum are such points at which the derivative
changes its sign in moving through them from left to right.
More definitely, if the sign of f'(x) changes from + to - while x
moves through a point x = a in the positive direction then the
function f has a maximum at x = a because the function increases
on the left of x = a and decreases on the right of x = a (see Fig. 121).
Similarly, the derivative changes its sign from - to + in moving
through a point of minimum.

It follows that under the conditions specified in Sec. 17 all the
points of extremum of a function are its critical points. This necessary
condition for an extremum was, in fact, formulated by P. Fermat.
As it is seen in Fig. 121 this condition is not sufficient, i.e. a critical
point may not be a point of extremum.

There are various sufficient conditions for the existence of an
extremum but they are used more seldom than the necessary condition
because in many concrete problems one often knows beforehand
that the extremum must exist. Its approximate location is
also usually known and it is only its exact value that remains unknown.
If the necessary condition indicates only one possible point
of extremum in the above circumstances then the extremum is sure
to be there. In case there are several extrema it is possible to find
them and determine the intervals of monotonicity (see Sec. 17)
simultaneously.

Since the values of a function change very slowly near a critical
point Fermat’s condition implies that if a point of extremum is
determined with an error then the error of evaluating the corresponding
extreme value is of higher order of smallness. Therefore
it is usually convenient (if it is possible) to reduce a problem of
determining some quantity to the problem of evaluating an extreme
or even a stationary (critical) value of a certain function. Then even
a rough determination of a point of extremum yields a good ultimate
result.

Conditions for an extremum can also be established on the basis
of Taylor’s formula (see Sec. 15). Let us investigate a point x = a
for a function f(x). It is seen from formula (49) that there is no
extremum at x = a if $f'(a) \ne 0$ since a change of the sign of h
results in the change of the sign of f'(a)h and therefore the sign
of the difference f(a + h) - f(a) also changes [because the terms
of the order of $h^2$ are negligibly small compared to f'(a)h for small
|h|].

If f'(a) = 0 and $f''(a) \ne 0$ then, by formula (50), we conclude
in like manner that there will be an extremum at x = a. Namely,
there will be a minimum at x = a provided f''(a) > 0 [since in
this case f(a + h) > f(a) for small |h|] and a maximum provided
f''(a) < 0. In case f'(a) = 0, f''(a) = 0 and $f'''(a) \ne 0$ formula
(51) implies that there will be no extremum at x = a. If f'(a) = 0,
f''(a) = 0, f'''(a) = 0 and $f^{IV}(a) \ne 0$ then an extremum exists
again and so on.
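
The rule just stated depends only on the order and the sign of the
first derivative that does not vanish at x = a, and it can be written
down as a small Python sketch. The function name `classify` and the
convention of passing the values f'(a), f''(a), f'''(a), ... as a list
are our assumptions made for illustration.

```python
def classify(derivatives):
    # derivatives[0] = f'(a), derivatives[1] = f''(a), and so on
    for index, value in enumerate(derivatives):
        if value != 0:
            order = index + 1
            if order % 2 == 1:
                # first non-vanishing derivative is of odd order
                return "no extremum"
            return "minimum" if value > 0 else "maximum"
    # all the supplied derivatives vanish: the rule gives no answer
    return "undecided"

print(classify([0, 2]))          # f(x) = x**2 at a = 0
print(classify([0, 0, 6]))       # f(x) = x**3 at a = 0
print(classify([0, 0, 0, -24]))  # f(x) = -x**4 at a = 0
```

The three calls correspond to a minimum, no extremum and a maximum,
respectively.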

19. The Greatest and the Least Values of a Function. Let a function
y = f(x) and its derivative be continuous on a closed interval
$a \le x \le b$. Suppose it is necessary to find the greatest and the least
values of the function. In Sec. 18 we considered extrema which were
attained at interior points of the intervals. But here we must also
take into account end-point extrema. For instance, the function in
Fig. 123 has a minimum at the end-point x = a and a maximum at
the end-point x = b but it also has two other extrema in the interior
of the interval. Of course, the derivative of a function is not necessarily
equal to zero at an end-point even if there is an extremum there.

Further, it should be noted that in Sec. 18 we investigated relative
extrema whereas now we are interested in the absolute maximum and
minimum. Therefore in order to find the greatest value of the function
on the interval $a \le x \le b$ it is necessary to determine all its
maxima at interior points as well as its end-point maxima on the
interval and then to compare the corresponding maximal values
with each other: the greatest of these maxima is just the greatest
value of the function. The same procedure yields the least value
of a function on a closed interval. To facilitate these calculations
we can simply compare all the critical and end-point values of the
function with each other: the greatest of the values will be the absolute
maximum and the least one will be the absolute minimum.
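
The simplified procedure (comparing all the critical and end-point
values) can be sketched in Python as follows. The sketch assumes that
the critical points have already been found; the name `extreme_values`
and the sample function $f(x) = x^3 - 3x$ on the interval
$-3 \le x \le 2$ are ours and serve only as an illustration.

```python
def extreme_values(f, critical_points, a, b):
    # Candidates: both end-points and the critical points inside (a, b)
    candidates = [a, b] + [x for x in critical_points if a < x < b]
    values = [(f(x), x) for x in candidates]
    # Each result is a pair (value of the function, point where it is taken)
    return max(values), min(values)

greatest, least = extreme_values(lambda x: x**3 - 3 * x, [-1, 1], -3, 2)
print(greatest, least)
```

Here the greatest value 2 is attained both at the critical point
x = -1 and at the end-point x = 2, whereas the least value -18 is
attained at the end-point x = -3, where the derivative is not zero.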

A continuous function f(x) may have a derivative f' with discontinuities.
In such a case the transition from the increase of f
to its decrease may take place not only at those points where f' = 0
but also at some points of discontinuity of the function f'(x). In
order to determine the intervals of monotonicity of f we should
investigate the sign of f' and this can be carried out in the way the
sign of f was determined in Sec. III.15. If the derivative has a discontinuity
at a point x = a and changes its sign in the process of
moving through the point x = a the function f(x) has a cuspidal
extremum at x = a (see the cuspidal minimum in Fig. 35 and the
cuspidal maxima in Fig. 406). A function f no longer changes slowly
near its cuspidal extremum whereas it does change slowly near
extrema attained at its critical points (see Sec. 18) where f' = 0.

Fig. 123    Fig. 124

Thus, the comprehensive statement of the necessary condition for
the existence of an extremum is the following: at a point of extremum
the derivative vanishes or has a discontinuity.

The conditions for an extremum based on Taylor’s formula no 
longer hold for the points of discontinuity of a derivative. Only 
the condition based on the change of the sign of a derivative remains 
true in such a case. 

If a function f(x) itself has discontinuities the points of discontinuity
may happen to be the end-points of some intervals of monotonicity
of the function even in those cases when the derivative has
the same sign on both sides from the point of discontinuity. For
instance, as it is seen in Fig. 124, y' > 0 everywhere for $x \ne a$ and
at the same time there are two intervals of increase of f: $-\infty < x < a$
and $a < x < \infty$ which cannot be combined in one interval. Therefore,
in determining intervals of monotonicity we should as well
indicate all the points of discontinuity of a function on the x-axis.

It should be taken into account that a function having a discontinuity
may have no upper bound and then, of course, the greatest
value will not exist at all. The same difficulty may occur in investigating
a function, even a continuous one, defined over an infinite
interval.

Now let us take the case when a function is discontinuous or defined
over an infinite interval and bounded. Then it may be impossible to
speak about the greatest value of the function which is attained
in the ordinary sense. Then the problem should be treated in the
sense of a limit. For instance, the greatest value of the function in
Fig. 124 is understood as f(a - 0). Even a very small change of
the argument near the value x = a results in a sharp change of the
value of the function. Thus, the value f(a - 0) is “unstable”. In
such circumstances it is unnatural to speak about the greatest value
of a function. Therefore we introduce the notion of the least upper
bound of a function. The last term means the greatest of all the
values of the function and of all the limits of the function*. The
notion of a greatest lower bound is introduced in like manner. The
least upper bound and the greatest lower bound of a function are
denoted, respectively, as sup f(x) and inf f(x) (which are the abbreviations
of the Latin supremum which means “the greatest” and
infimum which means “the lowest”).

* That is the greatest number among all the values of the function and
all the limits of convergent sequences which can be formed of values of the
function. (Tr.)

Example 1. Let the function $y = f(x) = \frac{1 + x^2}{1 + x^4}$ be considered
over the whole x-axis. Neither the function nor its derivative

$$y' = \frac{2x(1 + x^4) - 4x^3(1 + x^2)}{(1 + x^4)^2} = 2x\,\frac{1 - 2x^2 - x^4}{(1 + x^4)^2}$$

has discontinuities and therefore to find the intervals of monotonicity
it is necessary to equate y' to zero which results in the equation
$x(1 - 2x^2 - x^4) = 0$. Hence,

$$x_1 = 0; \quad x^4 + 2x^2 - 1 = 0; \quad (x^2)^2 + 2x^2 - 1 = 0$$

and $x^2 = -1 \pm \sqrt{2}$.
Only the sign + yields a real root and therefore $x^2 = \sqrt{2} - 1$
and $x_{2,3} = \pm\sqrt{\sqrt{2} - 1} = \pm 0.644$. Thus, the x-axis is divided
into four intervals (see Fig. 125). Substituting the values x = -10;
x = -0.1; x = 0.1 and x = 10 into y' we get, respectively, the
signs +, -, + and -. Hence, these intervals are, in succession,
the intervals of increase, decrease, increase and decrease of the
function. Consequently, the function has maxima at the points
$x = x_2$ and $x = x_3$ and a minimum at $x = x_1$. The maximal values are

$$f(x_2) = f(x_3) = \frac{\sqrt{2} + 1}{2} = 1.207$$

and the minimal one is $f(x_1) = 1$.


Besides, the “end-point” limits $f(-\infty)$ and $f(+\infty)$ both are
equal to zero since the numerator of f(x) is an infinitely large
variable of the second order relative to x as $x \to \pm\infty$ whereas the
denominator is of the fourth order. Hence, the greatest value 1.207
of the function is attained at $x = \pm 0.644$ whereas the least value
(equal to zero) is attained only in the passage to the limit as $x \to \pm\infty$.
A sketch of the graph of the function is shown in Fig. 126.

Fig. 126
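
The results of Example 1 are easy to verify numerically. The Python
sketch below is an illustration only; the names `f`, `df`, `x2` and
`signs` are ours. It evaluates the derivative at the four test points
used above and the function at the critical point
$x_2 = \sqrt{\sqrt{2} - 1}$.

```python
import math

def f(x):
    return (1 + x**2) / (1 + x**4)

def df(x):
    # derivative of f obtained by the quotient rule
    return 2 * x * (1 - 2 * x**2 - x**4) / (1 + x**4) ** 2

# Positive critical point and the signs of y' on the four intervals
x2 = math.sqrt(math.sqrt(2) - 1)
signs = ["+" if df(x) > 0 else "-" for x in (-10, -0.1, 0.1, 10)]

print(signs)
print(x2, f(x2), f(0))
```

The printed signs should be +, -, +, - and the values of x2 and f(x2)
should agree with 0.644 and 1.207 to three decimal places.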
Example 2. Given a square sheet of tin with side a, let it
be required to cut a box of maximal volume (see Fig. 127 where the
cutting lines are shown as continuous and the bending lines as
dotted). It is clear that the solution of the problem exists but we
do not know where the cut should be made (i.e. what x is) and what
the volume will be. If we first take an undetermined x then the
volume will be $V = (a - 2x)^2 x$ and, according to the conditions
of the problem, x must take a value between 0 and $\frac{a}{2}$. Applying
the necessary condition for an extremum we obtain

$$V' = 2(a - 2x)(-2)x + (a - 2x)^2 = (a - 2x)(a - 6x) = 0$$

which implies $x_1 = \frac{a}{2}$ and $x_2 = \frac{a}{6}$. The conditions of the problem
indicate that only $x = \frac{a}{6}$ will do, that is this value yields the
sought-for maximal volume which is equal to

$$V_{max} = \left(a - 2 \cdot \frac{a}{6}\right)^2 \frac{a}{6} = \frac{2a^3}{27}$$
Example 3. Let us consider the problem of refraction of the light
passing through the interface between two homogeneous (i.e. with
the same properties at all points) and isotropic (i.e. with the same
properties along all directions) media. Let us suppose first that
the interface is plane. Draw a plane through the ray of light (see
Fig. 128) and take points $A_1$ and $A_2$ on the ray. Now we can use
the so-called Fermat's principle in optics which states that the ray of
light propagating from $A_1$ to $A_2$ follows a trajectory such that it
takes the minimal interval of time for passing from $A_1$ to $A_2$ in
comparison with all the possible trajectories connecting $A_1$ and $A_2$.

Fig. 127    Fig. 128 ($v_1$ is the speed of light in the first medium; $v_2$ is the speed of light in the second medium; mm is the interface between the media)
According to the principle the point M ($A_1$ and $A_2$ are regarded as
fixed) must be located in such a position that the time interval

$$t = \frac{A_1M}{v_1} + \frac{MA_2}{v_2} = \frac{\sqrt{h_1^2 + x^2}}{v_1} + \frac{\sqrt{h_2^2 + (a - x)^2}}{v_2}$$

should be as short as possible. Here $h_1$ and $h_2$ are the distances of
$A_1$ and $A_2$ from the interface, x and a - x are the distances from M
to the feet of the perpendiculars dropped from $A_1$ and $A_2$ onto the
interface, and $l_1 = A_1M$, $l_2 = MA_2$. Applying the necessary condition
for the existence of an extremum we obtain

$$\frac{dt}{dx} = \frac{x}{v_1\sqrt{h_1^2 + x^2}} - \frac{a - x}{v_2\sqrt{h_2^2 + (a - x)^2}} = 0$$

from which it follows that

$$\frac{x}{l_1 v_1} = \frac{a - x}{l_2 v_2}, \qquad \frac{\sin\alpha_1}{\sin\alpha_2} = \frac{x}{l_1} : \frac{a - x}{l_2} = \frac{v_1}{v_2}$$

Thus, we have deduced the well-known law of refraction: the
sine of the angle of incidence bears a constant ratio to the sine
of the angle of refraction, equal to the ratio of the speeds of light
in the two media. If now the interface is not plane the law of refraction
will remain the same since the phenomenon of refraction depends
only on the properties of the media in an infinitesimal vicinity
of the point of refraction and the interface can be regarded as being
plane in such a vicinity.
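
The law of refraction can also be observed numerically by minimizing
the travel time directly, without using the derivative. The Python
sketch below is an illustration under our own assumptions: the
distances h1, h2 of the two points from the interface, the horizontal
distance a between them, the speeds v1, v2 and the function names are
all hypothetical, and the minimum is located by a ternary search, which
is applicable because the travel time is a convex function of x.

```python
import math

def travel_time(x, a, h1, h2, v1, v2):
    # time along the broken line A1 -> M -> A2, with M at distance x
    return math.hypot(h1, x) / v1 + math.hypot(h2, a - x) / v2

def fastest_crossing(a, h1, h2, v1, v2, steps=200):
    # ternary search for the minimum of a convex function on [0, a]
    lo, hi = 0.0, a
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, a, h1, h2, v1, v2) < travel_time(m2, a, h1, h2, v1, v2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

a, h1, h2, v1, v2 = 2.0, 1.0, 1.0, 1.0, 0.75
x = fastest_crossing(a, h1, h2, v1, v2)
sin1 = x / math.hypot(h1, x)            # sine of the angle of incidence
sin2 = (a - x) / math.hypot(h2, a - x)  # sine of the angle of refraction

print(sin1 / sin2, v1 / v2)
```

The two printed numbers should coincide up to the accuracy of the
search, which is limited by rounding errors near the flat minimum of
the travel time.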

Thus we see that we have managed to deduce a physical law 
on the basis of solving an extremum problem according to a general 
physical principle formulated in terms of an extremum. Such a 
principle states that a certain physical quantity must have an extre- 
mal value in real circumstances. 

A more sophisticated investigation shows that in Fermat's prin- 
ciple (and in some other principles of this kind) the essential con- 
dition is not that the time taken by the light to pass a distance 
must be minimal or even extremal but that the time must assume 
a stationary value. In the latter form Fermat's principle can be 
deduced from the wave theory of light. 


§ 7. Constructing Graphs of Functions 


Differential calculus gives us a general method of investigation 
of individual peculiarities of the graph of a given function which 
enables us to construct the graph more accurately and considerably 
faster than by the primitive method of plotting separate points of 
the graph as it was done in Sec. I.14. The determination of intervals
of monotonicity of a function described in § 6 is an important example 
of the method of constructing graphs. Besides, there are some other 
techniques for investigating graphs which are also of use. They 
will be discussed here. 

20. Intervals of Convexity of a Graph and Points of Inflection.
Let a function y = f(x) have a graph of the shape shown in Fig. 129.
We see that on the left of the point A and on the right of the point
B the graph is convex upwards and it is convex downwards between
A and B (see Sec. I.24). The points A and B at which the convexity
changes its character (i.e. the upward convexity transits to the
downward one or vice versa) are the points of inflection; the graph
intersects the tangents at these points though it forms the zero
angles with them.

In order to find the intervals of upward and downward convexity
note that the tangent to a graph turns clockwise as x increases on
every interval where the graph is convex upwards (for example,
for x < a in Fig. 129) and therefore the slope of the tangent decreases.
This slope being equal to y', the graph is convex upwards or downwards
on the intervals of the x-axis where y', respectively, decreases
or increases. These intervals can be found by investigating the sign
of y'' in the same way as the intervals of decrease and of increase
of y were investigated by determining the sign of y' in Sec. 17.

Fig. 129

Therefore, the graph is convex upwards or downwards on those intervals
of the x-axis where y'' < 0 or y'' > 0, respectively. The points
of inflection correspond to values of x such that in moving through x
the second derivative y'' changes its sign and y'' is equal to zero at the
points of inflection. It is supposed here that y, y' and y'' have no
discontinuities. If there are such discontinuities some of the points of
discontinuity may be the end-points of certain intervals of convexity
(upward or downward) and therefore all the discontinuities must be
indicated on the x-axis while constructing the graph.
21. Asymptotes of a Graph. A graph of y = f(x) may have
vertical (i.e. parallel to the y-axis) and inclined (i.e. not
parallel to the y-axis) asymptotes (see Fig. 130). There may be
any number of vertical asymptotes, even an infinite number (see,
for example, the tangent curve in Fig. 47), and they are determined
in the following way: if $|y| \to \infty$ as $x \to a$ (a is finite) then the
straight line x = a is a vertical asymptote.

Fig. 130
There cannot be more than two inclined asymptotes, corresponding
to x → ∞ and x → −∞, and they are found as follows: let a straight
line y = kx + b be an asymptote of the graph of y = f (x) as
x → ∞. Then (see Fig. 130) the difference δ = y_as − y_gr is equal to

$$\delta = (kx + b) - f(x) \tag{63}$$

and tends to zero as x → ∞. Whence,

$$\frac{f(x)}{x} = k + \frac{b - \delta}{x} \to k$$

that is

$$k = \lim_{x \to \infty} \frac{f(x)}{x}$$

Besides, by (63),

$$f(x) - kx = b - \delta \to b$$

that is

$$b = \lim_{x \to \infty} [f(x) - kx]$$

Each of these limits must exist and be finite, otherwise there will
be no asymptote as x → ∞. If these finite limits do exist then the
asymptote also exists since it is seen from the last equality that
the value [f (x) − kx] − b tends to zero as x → ∞, i.e.

$$\lim_{x \to \infty} [f(x) - (kx + b)] = 0$$
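
As a small worked illustration of the two limits (the function below is chosen here for illustration only and is not taken from the text), consider f (x) = (x² + 1)/(x − 1):

$$k = \lim_{x \to \infty} \frac{f(x)}{x} = \lim_{x \to \infty} \frac{x^2 + 1}{x^2 - x} = 1, \qquad b = \lim_{x \to \infty} [f(x) - kx] = \lim_{x \to \infty} \frac{x + 1}{x - 1} = 1$$

Both limits are finite, so y = x + 1 is an inclined asymptote as x → ∞; the same computation applies as x → −∞. In addition, | f (x) | → ∞ as x → 1, so the straight line x = 1 is a vertical asymptote.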


22. General Scheme for Investigating a Function and Constructing 
Its Graph. This scheme for a function y = f (x) consists of the follow- 
ing rules. 

(1) We find the domain of definition of the function, its points 
of discontinuity and zeros and then determine its intervals of posi- 
tivity and negativity. After that we investigate the behaviour of 
the function as its argument approaches the points of discontinuity 
and the end-points of intervals of definition (including the behaviour 
of the function at infinity). Further, we determine the asymptotes 
of the graph. We also find out whether the function is even, odd or 
periodic and so on. 

(2) We determine the points of discontinuity and zeros of the 
derivative and then find the intervals of increase and decrease of 
the function, its points of extremum and the extremal values. Fur- 
ther, we investigate the behaviour of the derivative in approaching 
its points of discontinuity, the points of discontinuity of the function 
itself (in case the function has finite jumps at these points) and the 
end-points of the intervals on which the function is defined (if these 
end-points are finite and the function has finite values at them). 

(3) We next determine the points of discontinuity of the second
derivative and its zeros and then find the intervals on which the
function is convex upwards or downwards and also the points of
inflection. It is also useful to determine the direction of the tangent
at the points of inflection.

All the points thus found should be plotted in the coordinate
plane and then the graph itself is constructed. The shape of the
graph must reflect all the individual peculiarities of the behaviour
of the function. If the elements of the graph indicated above do not
describe the behaviour of the graph clearly enough it is desirable to
plot several additional points by calculating the values of y for
some specifically chosen values of x. It is also expedient to determine
the direction of the tangent at those points after computing the
corresponding values of y'.

We shall present, as an example, the investigation of the behaviour
of the graph of the function y = ∛(x³ − 2x²). In this case
the domain of the function is the whole x-axis −∞ < x < ∞;
there are no points of discontinuity. Putting y = 0 we see that
the function has two zeros: x₁ = 0 and x₂ = 2. Therefore there are
three intervals of constant sign: −∞ < x < 0, 0 < x < 2
and 2 < x < ∞. Substituting arbitrary values of the argument taken
from these intervals we see that the function is negative on the first
and the second intervals and positive on the third one. There are
no vertical asymptotes. Determining inclined asymptotes in accordance
with Sec. 21 we find (the reader should check this!) that one
and the same straight line y = x − 2/3 serves as an inclined asymptote
both for x → ∞ and x → −∞. After computing the derivative

$$y' = \frac{3x^2 - 4x}{3\sqrt[3]{(x^3 - 2x^2)^2}} = \frac{3x - 4}{3\sqrt[3]{x(x - 2)^2}}$$

we see that it has discontinuities (approaches infinity) as x → 0 and
x → 2 and vanishes at x = 4/3. Thus we have four intervals of monotonicity:
−∞ < x < 0, 0 < x < 4/3, 4/3 < x < 2 and 2 < x < ∞.
Substituting arbitrary values from these intervals into y' we find
that only the second one is an interval of decrease and all the other
intervals are intervals of increase. Therefore the third and the fourth
ones form a single interval of increase. Hence, changes of the
character of monotonicity occur at x = 0 (where there is a maximum
with the maximal value y = 0) and at x = 4/3 (where there
is a minimum with the minimal value y = −(2/3)∛4 = −1.058).

Computing the second derivative we obtain, after some transformations,
the expression

$$y'' = -\frac{8}{9\sqrt[3]{x^4 (x - 2)^5}}$$

(verify this expression!). The only discontinuities of the second
derivative are the points x = 0 and x = 2 where the discontinuities
of the first derivative are located, and the second derivative has no
zeros at all. Thus, there are three intervals inside which “the
character of convexity” is invariable: −∞ < x < 0, 0 < x < 2
and 2 < x < ∞. Now we substitute arbitrary values taken from
these intervals into the second derivative, and then the sign of the
derivative shows that the function is convex downwards on the
first and on the second intervals and is convex upwards on the third
one. Let us, in addition, calculate for x = −1 the values

$$y = -\sqrt[3]{3} = -1.44 \quad \text{and} \quad y' = \frac{7}{3\sqrt[3]{9}} = 1.12$$

for x = 1,

$$y = -1 \quad \text{and} \quad y' = -\frac{1}{3} = -0.33$$

and for x = 3,

$$y = \sqrt[3]{9} = 2.08 \quad \text{and} \quad y' = \frac{5}{3\sqrt[3]{3}} = 1.16$$

The constructed graph is shown in Fig. 131 where those points
which were calculated are depicted as circles (we leave it to the
reader to check that all the individual peculiarities of the graph
are indeed represented here).

Fig. 131
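
The numbers quoted in this example can be spot-checked with a short script. The following is only a sketch: the helper names `cbrt`, `y` and `dy` are introduced here for illustration, and the limits of Sec. 21 are approximated by evaluating at one large value of | x |.

```python
import math

def cbrt(v):
    # Real cube root that also works for negative arguments.
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def y(x):
    # The function of the example: y = cbrt(x^3 - 2x^2)
    return cbrt(x**3 - 2 * x**2)

def dy(x):
    # y' = (3x - 4) / (3 * cbrt(x * (x - 2)^2)); undefined at x = 0 and x = 2
    return (3 * x - 4) / (3 * cbrt(x * (x - 2) ** 2))

# Minimal value, attained at x = 4/3
y_min = y(4 / 3)

# Inclined asymptote: k ~ f(x)/x and b ~ f(x) - k*x at a large |x|
big = 1.0e6
k_est = y(big) / big
b_est = y(big) - 1.0 * big            # uses the limiting value k = 1
b_est_neg = y(-big) - 1.0 * (-big)    # same limit as x -> -infinity

for x in (-1.0, 1.0, 3.0):
    print(x, round(y(x), 2), round(dy(x), 2))
print(round(y_min, 3), round(k_est, 4), round(b_est, 4), round(b_est_neg, 4))
```

The estimates of b for large positive and large negative x should both be close to −2/3, in agreement with the single asymptote y = x − 2/3.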

The disposition of the graph relative to its asymptote can be
easily found on the basis of its asymptotic expansion, that is an
expansion which holds for sufficiently large values of | x |. This
expansion, in its turn, follows from formula (60):

$$\begin{aligned} y = x\left(1 - \frac{2}{x}\right)^{1/3} &= x\left[1 + \frac{1}{3}\left(-\frac{2}{x}\right) + \frac{\frac{1}{3}\left(\frac{1}{3} - 1\right)}{2!}\left(-\frac{2}{x}\right)^2 + \dots\right] \\ &= x - \frac{2}{3} - \frac{4}{9x} + \text{infinitesimal terms of higher order relative to } \frac{1}{x} \text{ as } |x| \to \infty \end{aligned} \tag{64}$$

Hence, we have y < x − 2/3 = y_as for large x > 0 and y > y_as
for large x < 0 (y_as denotes here the ordinate of the
asymptote). Besides, from equality (64) it immediately follows that

$$y - \left(x - \frac{2}{3}\right) \to 0 \quad \text{as } |x| \to \infty \tag{65}$$

Hence, if we had not known the equation of the asymptote
y = x − 2/3 beforehand we could have deduced it from (65). Therefore, we have
established one more method of finding an inclined asymptote.


CHAPTER V 


Approximating Roots 
of Equations. 
Interpolation 


§ 1. Approximating Roots of Equations 


1. Introduction. We shall discuss here some methods of calculating 
a numerical solution of an equation of the form 


f(x) =0 (1) 


where f is a given function. Such an equation may be algebraic in 
case the function f is algebraic or transcendental if otherwise. We 
shall call both types of equation (1) finite to distinguish them, for 
example, from differential equations etc. Here we shall represent 
only some of the most important methods of solving equations of 
form (1); for other methods the reader is referred to special courses 
on calculus of approximations. 

The process of numerical solution of equation (1) usually begins
with finding a rough, approximate value of a root which is called
the zeroth approximation (the initial approximation). If a certain
physical problem is being considered such an initial approximation
may be known from the real physical conditions of the problem.
We can also begin by constructing an approximate sketch of the
graph of the function f. If in doing this we find that the function is
continuous over a closed interval between a and b and assumes
values of opposite signs at the end-points a and b then we are sure,
by the properties of continuous functions (see Sec. III.14), that
there exists at least one zero of f on the interval, that is equation
(1) has at least one root there. Besides, there must be only one root
of f on the interval provided the function f is monotonic between a
and b. In the last case the root is separated from the other ones. If we
denote the unknown root by x̄ then x̄ is sure to satisfy the inequality
a < x̄ < b. Different methods are used for further refinement
of the value x̄ (see Sec. 2).

It is sometimes more convenient to rewrite equation (1) in the
form φ (x) = ψ (x) and then to find the intersection point of the
graphs of y = φ (x) and y = ψ (x). In doing this one tries to break the
left-hand side of equation (1) into two summands in such a way that
this should yield some well-known or, at any rate, simpler graphs.
An appropriate substitution for the variable x may also sometimes
be of use.

For instance, let us take the equation

$$\tan ax^2 - bx^2 = 0 \tag{2}$$

where a and b are given numbers. The change of variable ax² = s reduces (2)
to the equation

$$\tan s = ks \quad \left(k = \frac{b}{a}\right) \tag{3}$$


The graphs of the left-hand and right-hand sides of equation (3)
are shown in Fig. 132. It is clear that we are interested in the values
s ≥ 0 only. We see that equation (3) has an infinitude of roots
s₀ = 0 < s₁ < s₂ < ..., and therefore equation (2) also has infinitely
many roots. The dependence of the roots of (3) on k, which can
be easily seen in Fig. 132, defines the dependence of the roots of the
original equation on the parameters a and b. We see that for k > 1
there appears a new root lying in the interval 0 < s < π/2 (why
is it so?).

Fig. 132

We can easily derive an asymptotic expression for the solutions of
equation (3) valid for large n. For definiteness, let k < 1. Then
Fig. 132 implies the desired expression sₙ = nπ + π/2 − αₙ
(αₙ > 0) where αₙ → 0 as n → ∞; this can be rewritten (see the
notation of Sec. III.11) as sₙ = nπ + π/2 + o (1). If it is necessary
to refine this expansion we can substitute it into (3) which results in

$$\tan\left(n\pi + \frac{\pi}{2} - \alpha_n\right) = k\left(n\pi + \frac{\pi}{2} - \alpha_n\right)$$

and after simple transformations we obtain

$$\cos\alpha_n = k\left(n\pi + \frac{\pi}{2} - \alpha_n\right)\sin\alpha_n \tag{4}$$

From this we deduce

$$\alpha_n \sim \sin\alpha_n = \frac{\cos\alpha_n}{k\left(n\pi + \frac{\pi}{2} - \alpha_n\right)} \sim \frac{1}{k\pi n}, \quad \text{i.e.} \quad \alpha_n = \frac{1}{k\pi n} + o\left(\frac{1}{n}\right)$$

If further refinement is desired then we may, for example, denote
1/n = t → 0, αₙ = α (t) → 0, which yields

$$t\cos\alpha = k\left[\pi + \left(\frac{\pi}{2} - \alpha\right)t\right]\sin\alpha \quad (\alpha = \alpha(t);\ \alpha(0) = 0)$$

Now it is possible to write down several terms of the expansion of
α (t) into Maclaurin’s series [of form (IV.54) but in powers of t].
The calculations which we leave to the reader give

$$\alpha_n = \alpha\left(\frac{1}{n}\right) = \frac{1}{k\pi}\cdot\frac{1}{n} - \frac{1}{2k\pi}\cdot\frac{1}{n^2} + \dots = \frac{1}{k\pi n} - \frac{1}{2k\pi n^2} + \dots$$

Now on the basis of formula (IV.60) we obtain an asymptotic
expression for the positive roots of equation (2) for large n:

$$x_n = \sqrt{\frac{s_n}{a}} = a^{-1/2}\left(n\pi + \frac{\pi}{2} - \frac{1}{k\pi n} + \dots\right)^{1/2} = \left(\frac{n\pi}{a}\right)^{1/2}\left[1 + \frac{1}{4n} + \dots\right]$$
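
The asymptotic expression for the roots of tan s = ks can be compared with roots found numerically. The sketch below makes several assumptions of its own: the sample value k = 0.5, the helper names, and the use of repeated halving applied to h(s) = sin s − ks cos s (which has the same roots as tan s − ks on the interval (nπ, nπ + π/2) but no poles).

```python
import math

def root_tan_eq(k, n, iters=200):
    """Root of tan(s) = k*s in (n*pi, n*pi + pi/2) for 0 < k < 1, n >= 1,
    found by repeated halving applied to h(s) = sin(s) - k*s*cos(s)."""
    h = lambda s: math.sin(s) - k * s * math.cos(s)
    lo, hi = n * math.pi, n * math.pi + math.pi / 2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(lo) * h(mid) <= 0:   # sign change on [lo, mid]
            hi = mid
        else:                     # sign change on [mid, hi]
            lo = mid
    return 0.5 * (lo + hi)

def root_asymptotic(k, n):
    # s_n ~ n*pi + pi/2 - 1/(k*pi*n) + 1/(2*k*pi*n^2)
    return (n * math.pi + math.pi / 2
            - 1 / (k * math.pi * n) + 1 / (2 * k * math.pi * n**2))

k = 0.5  # sample value with 0 < k < 1
for n in (5, 20, 50):
    print(n, root_tan_eq(k, n), root_asymptotic(k, n))
```

The two columns should approach each other as n grows, the discrepancy being of the order of 1/n³.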


2. Cut-and-Try Method. Method of Chords. Method of Tangents.
We begin with the cut-and-try method. Its scheme is the following.
Let, for definiteness, f (a) < 0 and f (b) > 0. We first take an arbitrary
value c between a and b and compute f (c). It should be noted
that it is the sign of f (c) that is important here and not the value
f (c) itself. Now let us suppose that we obtain f (c) > 0. This means
that we have “a shot over the target”, “a plus round”, and therefore
a < x̄ < c. Further, we take some value d between a and c and
compute f (d); if f (d) < 0 then we have “a minus round”, i.e.
d < x̄ < c and so on. The values c, d, ... may be taken more or
less arbitrarily. It is better, of course, to choose them in such a way
that the calculations are simpler. At the same time it should
be noted that if, for example, | f (a) | is much smaller than f (b),
it is quite likely that x̄ is closer to a than to b and therefore it is
better to take c closer to a and so on.
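
One systematic way of choosing the trial values is to take the midpoint of the current interval every time. The sketch below assumes this midpoint rule (the text leaves the choice free); the function name `cut_and_try`, the tolerance and the sample cubic are introduced here for illustration.

```python
def cut_and_try(f, a, b, tol=1e-6):
    """Narrow an interval [a, b] with f(a) < 0 < f(b) using only the sign of f.
    The trial point is taken at the midpoint (one possible choice)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("expected f(a) < 0 and f(b) > 0")
    while b - a > tol:
        c = 0.5 * (a + b)
        if f(c) > 0:      # "plus round": the root lies between a and c
            b = c
        else:             # "minus round" (or an exact hit): root between c and b
            a = c
    return a, b

f = lambda x: x**3 + x**2 - 3   # sample function with a root between 1 and 2
a, b = cut_and_try(f, 1.0, 2.0)
print(a, b)
```

Each trial halves the interval, so about twenty trials are needed here to reduce an interval of length 1 to a length below 10⁻⁶.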

The method of chords prescribes taking for the point c not an
arbitrary point but the point of intersection of the chord drawn
through the points M [a, f (a)] and N [b, f (b)] with the x-axis
(see Fig. 133). In other words, we act as if we approximately replaced
the arc of the graph by a line segment. This means that we are carrying
out linear interpolation, which looks sufficiently justified
provided the interval (a, b) is not too large. In order to find the
point c let us write the equation of the chord MN [see equation
(III.22)]:

$$\frac{y - f(a)}{f(b) - f(a)} = \frac{x - a}{b - a}$$

Now putting y = 0 we obtain the corresponding value x = c:

$$c = a - \frac{f(a)(b - a)}{f(b) - f(a)} = b - \frac{f(b)(b - a)}{f(b) - f(a)} \tag{5}$$

The procedure can be repeated several times if it is necessary (see
Fig. 133).

In the method of tangents (also called Newton’s method) we choose
as the point c the point of intersection with the x-axis of the tangent
line drawn to the graph at one of the end-points of the considered arc.
The equation of the tangent shown in Fig. 134 has
the form [see equation (IV.5)]

$$y - f(b) = f'(b)(x - b)$$

From this, putting y = 0, we derive

$$c = b - \frac{f(b)}{f'(b)} \tag{6}$$

The procedure described here may also be repeated several times
(see Fig. 134).

Newton’s method may be interpreted irrespective of its geometric
meaning. Let us denote the zeroth approximation as x₀ and expand
the left-hand side of (1) in powers of x − x₀ on the basis of Taylor’s
formula (IV.53); this yields the equation

$$f(x_0) + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \dots = 0$$

If now we carry out the linearization, that is if we drop all the terms
that are infinitesimals of orders higher than the first, we shall get
the linearized equation (1):

$$f(x_0) + f'(x_0)(x - x_0) = 0$$

The solution of this equation is

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$$

It can be taken as the first approximation of the root of equation (1).
Thus we arrive at the same formula (6). The second approximation
can be obtained from the first approximation by using the formula

$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} \tag{7}$$

and so forth. Newton’s method always leads to the desired root provided
the zeroth approximation does not lie too far from it.
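
The repeated application of the formula for the next approximation can be sketched as follows. The function name `newton`, the stopping rule (a step smaller than `tol`) and the sample cubic with its derivative are assumptions made for this illustration.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton iterations x_{n+1} = x_n - f(x_n)/f'(x_n), starting from x0."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)   # fails if the derivative vanishes at x
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("no convergence within max_iter iterations")

f = lambda x: x**3 + x**2 - 3     # sample function
df = lambda x: 3 * x**2 + 2 * x   # its derivative
root = newton(f, df, 2.0)
print(root)
```

Starting from x₀ = 2 only a handful of iterations are needed; a starting value far from the root, or one at which the derivative vanishes (x₀ = 0 here), would not be a suitable zeroth approximation.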


The following modification of Newton’s method is sometimes
used: the denominators in formula (7) and in the formulas for the
further approximations are replaced by f' (x₀). This means, geometrically,
that all the inclined lines in Fig. 134 are drawn so that
they are parallel to the tangent at the original point N. The convergence
of the modified method is somewhat slower than that of
the original scheme but the calculation of each approximation is,
naturally, simplified.

Fig. 133    Fig. 134

The combined method is based on the following consideration.
If the segment of the graph in question has no points of inflection
and is not broken the method of chords and the method of tangents
give points lying on different sides of the desired root. If, for
example, a graph is situated in the way shown in Fig. 135 then,
beginning with the interval (a, b), we can determine the point a₁
by the method of chords and the point b₁ by the method of tangents.
This will result in a new interval (a₁, b₁) containing the desired
root x̄. Repeating the analogous procedure for the interval (a₁, b₁)
we again obtain a new interval (a₂, b₂) containing the desired root
etc. Thus, we obtain in succession two-sided approximations.
The approximation process is stopped when the required degree of
accuracy is attained.

Fig. 135    Fig. 136

Let us take, for example, the equation

$$x^3 + x^2 - 3 = 0 \tag{8}$$

We shall consider the coefficients of equation (8) to be quite exact.
The investigation of the derivatives indicates that the left-hand
side of (8), which we denote by f (x), increases from −∞ to −77/27 ≈ −2.85
for x increasing in the interval −∞ < x < −2/3, decreases to −3
for −2/3 < x < 0 and then again increases to ∞ for 0 < x < ∞.
Besides, f (x) has only one point of inflection at x = −1/3 (check
this!). Consequently, the equation has a unique real root x̄, which is
positive. Since f (0) = −3, f (1) = −1 and f (2) = 9 (see Fig. 136)
we have 1 < x̄ < 2. Calculating in accordance with the cut-and-try
method we obtain f (1.1) = −0.459 and f (1.2) = 0.168,
i.e. 1.1 < x̄ < 1.2 (a crude estimate of a root is usually obtained
by the cut-and-try method). Now putting a = 1.1 and b = 1.2
we apply formulas (5) and (6) (i.e. we use the combined method):

$$a_1 = 1.2 - \frac{0.168 \times 0.1}{0.627} = 1.1732, \qquad b_1 = 1.2 - \frac{0.168}{6.72} = 1.1750$$

Hence, we can put x̄ = 1.174, this value being accurate to 0.001.
If the accuracy is insufficient we can proceed to further calculations:
f (1.174) = −0.003628 (i.e. we have a “minus round”, the calculations
being accurate to 10⁻⁶) and f (1.175) = 0.002859. Taking
now a = 1.174, b = 1.175 and calculating by the combined method
we obtain the values a₂ = 1.1745593 and b₂ = 1.1745596 accurate
to 10⁻⁷. Hence, with an accuracy of 0.000001 we can assume
x̄ = 1.174559. Note how fast the degree of accuracy increases!
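
The two steps of the combined method carried out above can be reproduced by a few lines of code. This is a sketch under the assumption that the left-hand side of (8) is f (x) = x³ + x² − 3, which agrees with the values f (1) = −1, f (2) = 9, f (1.1) = −0.459 and f (1.2) = 0.168 quoted in the text; the name `combined_step` is introduced here.

```python
def combined_step(f, df, a, b):
    """One step of the combined method on [a, b]:
    chord through both end-points (formula (5)) and tangent at b (formula (6))."""
    a1 = a - f(a) * (b - a) / (f(b) - f(a))
    b1 = b - f(b) / df(b)
    return a1, b1

f = lambda x: x**3 + x**2 - 3     # left-hand side of (8)
df = lambda x: 3 * x**2 + 2 * x   # its derivative

a1, b1 = combined_step(f, df, 1.1, 1.2)
a2, b2 = combined_step(f, df, 1.174, 1.175)
print(a1, b1)
print(a2, b2)
```

Since the computation here is carried out without intermediate rounding, the last printed digits may differ by a unit or so from the hand-computed values a₂ and b₂, which were obtained from function values rounded to 10⁻⁶.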

3. Iterative Method. The methods described in Sec. 2 belong to 
the class of iterative methods (or, in other words, to the class of 
methods of successive approximations). A characteristic feature 
of all these methods is the successive iteration of one and the same 
scheme during the calculation process. This uniformity, i.e. the 
repetition of one and the same process, has many advantages. In 
particular, it is very convenient when we use digital computers. 

The general form of the iterative method applicable to equation
(1) is the following: the equation is rewritten in the equivalent form

$$x = \varphi(x) \tag{9}$$

Then we choose a certain value x = x₀ as the zeroth approximation.
It is desirable, of course, that x₀ should be as close as possible to
the sought-for root if we have some information about it. The subsequent
approximations are computed by the formulas x₁ = φ (x₀),
x₂ = φ (x₁), ..., or, generally,

$$x_{n+1} = \varphi(x_n) \tag{10}$$

There can be two cases here.

(1) The process may converge, that is the successive approximations
xₙ tend to a limit x̄ as n → ∞. In this case we can pass to the
limit in formula (10) which yields x̄ = φ (x̄). Thus, we see that
x = x̄ is a root of equation (9).

(2) The process may diverge, that is there may be no finite limit
for the “approximations” thus constructed. But this fact does not
necessarily imply that there is no solution of equation (9) because
it might simply occur that the iterative process was constructed
in an inappropriate way. By the way, it may happen that even in
the case of convergence we obtain some other solution (which may
have no physical meaning) quite different from the desired root
in whose vicinity x₀ has been chosen.

We shall demonstrate these possibilities by taking as an example
a very simple equation which can be easily solved:

$$x = \frac{x}{2} + 1 \tag{11}$$

The equation has the obvious solution x = 2. If we put x₀ = 0 and
calculate with an accuracy of 0.001 we shall obtain x₁ = 1.000,
x₂ = 1.500, x₃ = 1.750, x₄ = 1.875, x₅ = 1.938, x₆ = 1.969,
x₇ = 1.984, x₈ = 1.992, x₉ = 1.996, x₁₀ = 1.998, x₁₁ = 1.999,
x₁₂ = 2.000 and x₁₃ = 2.000, that is the process has “practically
converged”.

If we take the equation

$$x = \frac{x}{10} + 1$$

instead of (11) and assume x₀ = 0 then we obtain the values
x₁ = 1.000, x₂ = 1.100, x₃ = 1.110, x₄ = 1.111 and x₅ = 1.111 accurate
to 0.001; thus, the process has practically converged after four
iterations.

If we solve equation (11) for the x entering into the right-hand side,
that is if we rewrite (11) in the equivalent form

$$x = 2x - 2 \tag{12}$$

and begin with x₀ = 0, we shall get the sequence x₁ = −2,
x₂ = −6, x₃ = −14 etc., that is the process will not converge. We
could have forecast this result if we had observed that formula (10)
implies the equality

$$x_{n+1} - x_n = \varphi(x_n) - \varphi(x_{n-1}) \tag{13}$$

that is x₂ − x₁ = φ (x₁) − φ (x₀), x₃ − x₂ = φ (x₂) − φ (x₁) and
so forth. In case a function changes more slowly than its argument
or, more precisely, if

$$|\varphi(x) - \varphi(\tilde{x})| \le k\,|x - \tilde{x}| \quad (k = \text{const} < 1) \tag{14}$$

the distance between the successive approximations rapidly approaches
zero and the iterative process converges. The smaller k, the
greater the speed of convergence. Inequality (14) must be fulfilled
for all x and x̃ or, at any rate, near the sought-for root x̄ of equation
(9). We shall show in Sec. 4 that inequality (14) holds provided
| φ' (x) | ≤ k.
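
The three iterative processes considered above are easy to reproduce. In the sketch below the helper `iterate` and the variable names are introduced for illustration; the three functions are the right-hand sides x/2 + 1, x/10 + 1 and 2x − 2 discussed in the text.

```python
def iterate(phi, x0, n):
    """Return [x0, x1, ..., xn] with x_{m+1} = phi(x_m)."""
    xs = [x0]
    for _ in range(n):
        xs.append(phi(xs[-1]))
    return xs

seq_11 = iterate(lambda x: x / 2 + 1, 0.0, 13)    # equation (11), |phi'| = 1/2
seq_10 = iterate(lambda x: x / 10 + 1, 0.0, 5)    # x = x/10 + 1,  |phi'| = 1/10
seq_12 = iterate(lambda x: 2 * x - 2, 0.0, 3)     # equation (12), |phi'| = 2

print([f"{x:.3f}" for x in seq_11])
print([f"{x:.3f}" for x in seq_10])
print(seq_12)
```

The first two processes satisfy (14) with k = 1/2 and k = 1/10 and converge, the second one faster; for the third | φ' | = 2 > 1 and the distance from the root x = 2 doubles at every step.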

We see that equations (11) and (12) are equivalent but generate
different iterative processes. In other cases equation (1) may be
rewritten in form (9) in many different ways, each of these ways
generating its own iterative process. Some of these processes may
happen to converge fast and are therefore more convenient, some
of them may converge slowly and, finally, some of them may simply
diverge. In particular, it is easy to verify that if we rewrite equation
(1) in the form

$$x = x - \frac{f(x)(b - x)}{f(b) - f(x)}$$

and begin with x₀ = a [see formula (5)] this will yield the method
of chords. Similarly, if equation (1) is rewritten in the form

$$x = x - \frac{f(x)}{f'(x)} \tag{15}$$

then we arrive at the method of tangents.

There exists a comprehensive theoretical investigation of the
problem of convergence of the iterative method. But in problems more
complicated than the above ones it is often easier to compute
several approximations. Then judging by the results we can draw
the necessary conclusion as to the convergence of the process without
giving any theoretical proof. If we see that an approximation differs
from the preceding one by a small quantity (for instance, if their
difference is less than the required degree of accuracy) we have every
reason to stop the iterative process. At any rate, such a situation shows
that the approximation satisfies equation (9) with a good accuracy
because | xₙ − xₙ₋₁ | < h implies, in view of (14), | xₙ − φ (xₙ) | < kh.

Fig. 137

4. Formula of Finite Increments. Inequality (14) can be verified by
means of the so-called formula of finite increments which we are going
to deduce here. Let us suppose that a function y = φ (x) is continuous
over an interval a ≤ x ≤ b. Consider the graph of the function
on the interval and draw the chord MN connecting the end-points
of the arc (see Fig. 137). Let the derivative of the function
also be continuous. Besides, we suppose, for definiteness, that there
is a portion of the graph lying above the chord.

Now we draw a straight line parallel to MN and lying above
the graph. Imagine that we begin to lower the line so that it
remains parallel to MN. Then there must exist a moment when the
straight line touches the graph at a point p. Thus, there is at least
one point lying on a smooth arc such that the tangent to the arc at this
point is parallel to the chord connecting the end-points of the
arc. If now we equate the slopes of the chord and of the tangent we
obtain the formula

$$\frac{\varphi(b) - \varphi(a)}{b - a} = \varphi'(c), \quad \text{i.e.} \quad \varphi(b) - \varphi(a) = \varphi'(c)(b - a) \tag{16}$$

where c is a point lying between a and b. Formula (16) is called the
formula of finite increments (since the distance from a to b need not
be small) or Lagrange’s theorem. We must note that the value c
entering into formula (16) is not at all an arbitrary number for
the given function and the given interval (a, b) although there may
be several values that may serve as c. For instance, looking at
Fig. 137 we see that we can as well take c* in place of c in formula
(16) because the tangent at the point p* is also parallel to the chord
MN. The exact value of c is usually unknown when we apply formula
(16) but it is often enough to know that c lies somewhere
between a and b.

For example, suppose that | φ' (x) | ≤ k on an interval. Then
applying formula (16) to two arbitrary points x and x̃ taken from
the interval we see that | φ (x) − φ (x̃) | ≤ k | x − x̃ | for them
[see formula (14)].

It also follows from formulas (13) and (16) that if the successive
approximations lie not far from the exact solution x̄, and
therefore φ' (x) changes very little, the speed of convergence of the
iterative process is approximately that of a geometrical progression
with the ratio φ' (x̄). If the differences between successive approximations
formed an exact geometrical progression [as in example (11)]
its first term would be a = x₁ − x₀ and the ratio would equal
q = (x₂ − x₁)/(x₁ − x₀). Therefore the sum of such a progression, that is the
difference x̄ − x₀, would be equal to

$$\frac{a}{1 - q} = \frac{x_1 - x_0}{1 - \dfrac{x_2 - x_1}{x_1 - x_0}} = \frac{(x_1 - x_0)^2}{2x_1 - x_0 - x_2}$$

and therefore

$$\bar{x} = x_0 + \frac{(x_1 - x_0)^2}{2x_1 - x_0 - x_2} = \frac{x_1^2 - x_0 x_2}{2x_1 - x_0 - x_2} \tag{17}$$


In more complicated cases the successive differences of approxima- 
tions only resemble the terms of a geometrical progression. Then 
formula (17) does not yield an exact solution but makes it possible 
to omit several approximations and to get an approximate value 
of a root which may again initiate a new iterative process. 
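
Formula (17) can be tried out on two cases. The sketch below uses example (11), for which the differences form an exact progression, and a nonlinear equation x = cos x which is chosen here purely for illustration and does not appear in the text; the name `extrapolate` is likewise an assumption.

```python
import math

def extrapolate(x0, x1, x2):
    """Formula (17): the limit of the approximations if the differences
    x1 - x0, x2 - x1, ... formed an exact geometrical progression."""
    return (x1 * x1 - x0 * x2) / (2 * x1 - x0 - x2)

# Example (11): x = x/2 + 1 with x0 = 0, x1 = 1, x2 = 1.5
exact = extrapolate(0.0, 1.0, 1.5)

# Illustrative nonlinear case: x = cos x, started at x0 = 1
x0 = 1.0
x1 = math.cos(x0)
x2 = math.cos(x1)
improved = extrapolate(x0, x1, x2)
print(exact, x2, improved)
```

In the first case the formula returns the root x = 2 at once; in the second the extrapolated value is not exact but lies noticeably closer to the root (approximately 0.739) than x₂ does, and may serve as the start of a new iterative process.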
Newton’s iterative process is of special importance. Indeed, the
derivative of the right-hand side of (15) is equal to

$$\left[x - \frac{f(x)}{f'(x)}\right]' = 1 - \frac{f'^2(x) - f(x)f''(x)}{f'^2(x)} = \frac{f(x)f''(x)}{f'^2(x)}$$

and vanishes at x = x̄ since f (x̄) = 0. Hence, by the preceding
considerations, Newton’s iterative method converges faster than
any geometrical progression with an arbitrary ratio. The rate of
this convergence may be easily illustrated by the following typical
example. Let us consider the approximations obtained by Newton’s
method for the root x̄ = 0 of the equation x + x² = 0. These approximations
are connected with each other by the relation

$$x_{n+1} = x_n - \frac{x_n + x_n^2}{1 + 2x_n} = \frac{x_n^2}{1 + 2x_n} \approx x_n^2$$

To estimate the rate of convergence let us replace this approximate
equality by the exact one. Then we get in succession x₁ = x₀²,
x₂ = x₁² = x₀⁴, x₃ = x₂² = x₀⁸ etc. Generally, xₙ = x₀^(2ⁿ). The right-hand
side tends to zero as n → ∞, for | x₀ | < 1, faster than any
exponential expression.
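
The rapid decrease of the approximations in this example can be observed directly. The starting value x₀ = 0.5 and the names below are chosen here for illustration.

```python
def newton_step(x):
    # Newton's formula for f(x) = x + x^2 simplifies to x^2 / (1 + 2x)
    return x * x / (1 + 2 * x)

x = 0.5
history = [x]
for _ in range(5):
    x = newton_step(x)
    history.append(x)

for n, xn in enumerate(history):
    # actual approximation next to the model value x0^(2^n)
    print(n, xn, 0.5 ** (2 ** n))
```

For positive approximations the factor 1/(1 + 2xₙ) is less than unity, so each approximation does not exceed the square of the preceding one, and the number of correct decimal places roughly doubles at every step.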

5. Small Parameter Method. The small parameter method (the
perturbation method), as well as the iterative method, is one of the
most universal methods in mathematics. Here is the general idea
of the method. Let us consider a problem involving some unknown
quantities and, in addition, a parameter α. Suppose that it is not
too difficult to solve the problem for a certain value α = α₀ (this
is the so-called unperturbed solution). Then the solution for α lying
close to α₀ (the so-called perturbed solution) may in many cases be
obtained, with a certain degree of accuracy, by expanding the solution
in powers of α − α₀ by means of formulas similar to formulas
(IV.49), (IV.50), (IV.51) etc. Obviously, the first term of
such an expansion does not contain α − α₀ and is obtained for
α = α₀, i.e. it coincides with the unperturbed solution. The subsequent
terms yield corrections to the unperturbed solution; these
terms are infinitesimals of the first, second etc. orders (relative to
α − α₀). These terms are usually computed by the method of undetermined
coefficients, i.e. the coefficients of (α − α₀), (α − α₀)²
etc. are denoted by letters and the values denoted by the letters
are then found on the basis of the conditions of the problem. This
method gives a good result only for values of α which are close to
α₀. The smaller | α − α₀ |, the smaller the number of terms that
should be computed to attain a desired accuracy. It is often convenient
to choose the parameter so that α₀ = 0; then the difference
α − α₀ = α is considered small and this accounts for the term
“the small parameter method”. The number of terms that must be
taken may be determined by a method similar to the one used at
the end of Sec. III.6. It should also be noted that an attempt to use
the small parameter method for large | α − α₀ | may lead to fundamental
errors because the dropped terms may be more significant than
the retained ones in this case.

Thus, the small parameter method makes it possible to obtain
a solution of a problem that is formulated in terms which are close
to the terms of a certain “main” problem provided, of course, this
change of the formulation does not yield a fundamental, qualitative
change of the solution. Even determining the first term containing
the parameter often enables us to draw some useful conclusions
concerning the dependence of the solution on the parameter.


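Example. Let us solve the equation

$$x^3 - \alpha x^2 + 1 = 0 \tag{18}$$

for small | α | with an accuracy up to the term α³ inclusively. In
order to do this note that the value α = 0 yields the equation
x³ + 1 = 0 which has an obvious root x₀ = −1. Therefore we write
down the expression

$$x = -1 + a\alpha + b\alpha^2 + c\alpha^3 + \text{infinitesimals of higher order}$$

Substituting this expression into (18) and taking terms only up
to the order of α³ we obtain (check it!)

$$(-1 + 3a\alpha + 3b\alpha^2 - 3a^2\alpha^2 - 6ab\alpha^3 + 3c\alpha^3 + a^3\alpha^3) - \alpha(1 - 2a\alpha - 2b\alpha^2 + a^2\alpha^2) + 1 + \text{infinitesimals of higher order} = 0$$

From this, equating coefficients of the same powers of α, we derive
3a − 1 = 0, 3b − 3a² + 2a = 0 and −6ab + 3c + a³ + 2b − a² = 0.
Now we find, in succession, a = 1/3, b = −1/9 and c = 2/81.
Hence, we obtain the expression

$$x = -1 + \frac{\alpha}{3} - \frac{\alpha^2}{9} + \frac{2\alpha^3}{81} \tag{19}$$

for the root of equation (18) which is accurate to infinitesimals of
higher order relative to α³ (namely, the error is of the order of α⁴).
Just the same result may be obtained by applying directly Taylor’s
formula (IV.51) in which we change the notation a little:

$$x = x_0 + \left(\frac{dx}{d\alpha}\right)_0 \alpha + \frac{1}{2!}\left(\frac{d^2x}{d\alpha^2}\right)_0 \alpha^2 + \frac{1}{3!}\left(\frac{d^3x}{d\alpha^3}\right)_0 \alpha^3 + \dots \tag{20}$$

Here the subscript “zero” indicates that the value α = 0 is substituted
into the corresponding terms. Now let us differentiate
equality (18) with respect to α in a manner similar to the one used
in Sec. IV.11:

$$3x^2\frac{dx}{d\alpha} - x^2 - 2\alpha x\frac{dx}{d\alpha} = 0,$$

$$6x\left(\frac{dx}{d\alpha}\right)^2 + 3x^2\frac{d^2x}{d\alpha^2} - 4x\frac{dx}{d\alpha} - 2\alpha\left(\frac{dx}{d\alpha}\right)^2 - 2\alpha x\frac{d^2x}{d\alpha^2} = 0,$$

$$6\left(\frac{dx}{d\alpha}\right)^3 + 18x\frac{dx}{d\alpha}\frac{d^2x}{d\alpha^2} + 3x^2\frac{d^3x}{d\alpha^3} - 6\left(\frac{dx}{d\alpha}\right)^2 - 6x\frac{d^2x}{d\alpha^2} - 6\alpha\frac{dx}{d\alpha}\frac{d^2x}{d\alpha^2} - 2\alpha x\frac{d^3x}{d\alpha^3} = 0$$

Substituting α = 0 and x = −1 we derive

$$3\left(\frac{dx}{d\alpha}\right)_0 - 1 = 0, \qquad -6\left(\frac{dx}{d\alpha}\right)_0^2 + 3\left(\frac{d^2x}{d\alpha^2}\right)_0 + 4\left(\frac{dx}{d\alpha}\right)_0 = 0,$$

$$6\left(\frac{dx}{d\alpha}\right)_0^3 - 18\left(\frac{dx}{d\alpha}\right)_0\left(\frac{d^2x}{d\alpha^2}\right)_0 + 3\left(\frac{d^3x}{d\alpha^3}\right)_0 - 6\left(\frac{dx}{d\alpha}\right)_0^2 + 6\left(\frac{d^2x}{d\alpha^2}\right)_0 = 0$$

from which we obtain, in succession,

$$\left(\frac{dx}{d\alpha}\right)_0 = \frac{1}{3}, \qquad \left(\frac{d^2x}{d\alpha^2}\right)_0 = -\frac{2}{9}, \qquad \left(\frac{d^3x}{d\alpha^3}\right)_0 = \frac{4}{27}$$

From this it is seen that formula (20) implies expansion (19) which
is sufficiently accurate for small | α |.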
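
The quality of the expansion for the perturbed root can be checked numerically against a root computed by Newton’s method. The sketch assumes that equation (18) reads x³ − αx² + 1 = 0, which is the form consistent with the substitution carried out above; the function names and the sample values of α are introduced here.

```python
def series_root(alpha):
    # Expansion (19): x = -1 + alpha/3 - alpha^2/9 + 2*alpha^3/81
    return -1 + alpha / 3 - alpha**2 / 9 + 2 * alpha**3 / 81

def newton_root(alpha, x=-1.0, iters=50):
    # Root of x^3 - alpha*x^2 + 1 = 0 near x = -1, by Newton's method
    for _ in range(iters):
        x -= (x**3 - alpha * x**2 + 1) / (3 * x**2 - 2 * alpha * x)
    return x

errors = {}
for alpha in (0.4, 0.2, 0.1):
    errors[alpha] = abs(series_root(alpha) - newton_root(alpha))
    print(alpha, series_root(alpha), newton_root(alpha), errors[alpha])
```

The discrepancy should decrease very quickly as α decreases, as is to be expected of an error of higher order than α³.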

The small parameter method is closely and directly related to
the iterative method of Sec. 3. We shall illustrate this connection
by taking the same example (18). It is always convenient to have
an unperturbed solution equal to zero. To attain this let us make
the substitution x = −1 + y which yields

$$-1 + 3y - 3y^2 + y^3 - \alpha + 2\alpha y - \alpha y^2 + 1 = 0$$

i.e.

$$y = \frac{\alpha}{3} - \frac{2}{3}\alpha y + y^2 + \frac{1}{3}\alpha y^2 - \frac{1}{3}y^3$$

If now we carry out iterations beginning with the value y₀ = 0 and
dropping the infinitesimal terms of order higher than the third we
get the desired result after three iterations. It is also easy to verify
that it is permissible to neglect those terms of each approximation
which are infinitesimals of an order higher than the number of the
approximation.
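
These iterations can be carried out mechanically if each approximation is stored as a list of coefficients of the powers of α and all terms above α³ are dropped after every multiplication. The representation, the helper names `mul` and `phi` and the use of exact fractions are choices made for this sketch.

```python
from fractions import Fraction as F

ORDER = 3  # keep powers of alpha up to alpha^3

def mul(p, q):
    """Product of two polynomials in alpha (coefficient lists), truncated at ORDER."""
    r = [F(0)] * (ORDER + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j <= ORDER:
                r[i + j] += pi * qj
    return r

def phi(y):
    """Right-hand side: alpha/3 - (2/3) alpha y + y^2 + (1/3) alpha y^2 - (1/3) y^3."""
    alpha = [F(0), F(1), F(0), F(0)]
    y2 = mul(y, y)
    y3 = mul(y2, y)
    ay = mul(alpha, y)
    ay2 = mul(alpha, y2)
    return [alpha[i] / 3 - F(2, 3) * ay[i] + y2[i] + ay2[i] / 3 - y3[i] / 3
            for i in range(ORDER + 1)]

y = [F(0)] * (ORDER + 1)      # y0 = 0
for _ in range(3):
    y = phi(y)
print(y)                      # coefficients of alpha^0, alpha^1, alpha^2, alpha^3
```

After three iterations the coefficients of α, α² and α³ are 1/3, −1/9 and 2/81, so that x = −1 + y reproduces the expansion obtained by the method of undetermined coefficients.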


§ 2. Interpolation 


6. Lagrange’s Interpolation Formula. As was shown in Sec. I.22,
the process of linear interpolation consists in an approximate replacement
of a given function y = f (x) by a linear function y = ax + b
coinciding with f (x) at two points. Obviously, the accuracy of
such an approximation may be increased by taking a polynomial
of degree n of the form

$$P(x) = P_n(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_{n-1} x + a_n$$

in place of a linear function.

The polynomial Pₙ (x) approximating the function f (x) contains
n + 1 parameters (i.e. its coefficients), and thus n + 1 conditions,
in general, are needed to determine such a polynomial. Let us take,
for the sake of simplicity, a polynomial of the second degree:

$$P(x) = P_2(x) = ax^2 + bx + c$$

(the general case is investigated quite similarly). To choose such
a polynomial we must set three conditions. These conditions are
often taken in the following form: the polynomial should coincide
with the function f (x) at three given points:

$$P(x_1) = f(x_1), \quad P(x_2) = f(x_2), \quad P(x_3) = f(x_3) \tag{21}$$

These three values are also regarded as known.

It is quite evident that there can be only one such polynomial.
Indeed, if another polynomial of the second degree Q (x) satisfied
conditions (21) the difference P (x) − Q (x) (which is also a polynomial
of degree not higher than the second) would be equal to zero at x = x₁,
x = x₂ and x = x₃. This implies that the difference is identically
equal to zero (why is it so?). Thus, all the coefficients of the difference
are equal to zero, and therefore Q (x) = P (x).

Lagrange’s idea is to look for a polynomial P(x) in the form

$$P(x) = A(x - x_2)(x - x_3) + B(x - x_1)(x - x_3) + C(x - x_1)(x - x_2) \tag{22}$$

where A, B and C are some constants yet unknown. It is clear that this is a polynomial of the second degree. To find the constants A, B and C let us take conditions (21) and notice that substituting x = x_1, x = x_2 and x = x_3 into the right-hand side of formula (22) yields only one nonzero summand while the other two vanish. Hence we obtain

$$f(x_1) = A(x_1 - x_2)(x_1 - x_3), \qquad f(x_2) = B(x_2 - x_1)(x_2 - x_3), \qquad f(x_3) = C(x_3 - x_1)(x_3 - x_2)$$

Now we find A, B and C from the latter relations and substitute them into (22). Thus we have deduced Lagrange’s interpolation formula

$$f(x) \approx P(x) = f(x_1) \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} + f(x_2) \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} + f(x_3) \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)} \tag{23}$$

For practical applications of the formula it is desirable that none of the differences x_1 − x_2, x_1 − x_3 and x_2 − x_3 should be too small. (Think why this condition is important.)
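
The construction of the constants A, B and C can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name `lagrange3` and the sample points (taken from the function x²) do not come from the text.

```python
def lagrange3(points):
    """Second-degree interpolation polynomial through three points (x_i, f_i)."""
    (x1, f1), (x2, f2), (x3, f3) = points
    # Each constant is found from the single nonzero summand at its own node.
    A = f1 / ((x1 - x2) * (x1 - x3))
    B = f2 / ((x2 - x1) * (x2 - x3))
    C = f3 / ((x3 - x1) * (x3 - x2))

    def P(x):
        return (A * (x - x2) * (x - x3)
                + B * (x - x1) * (x - x3)
                + C * (x - x1) * (x - x2))

    return P


# Hypothetical data: three values of f(x) = x**2.
P = lagrange3([(1.0, 1.0), (2.0, 4.0), (4.0, 16.0)])
print(P(3.0))
```

Since the sample function is itself a polynomial of the second degree, the interpolation polynomial must coincide with it, which gives a simple check of the uniqueness argument above. The divisions by the node differences also show why closely spaced nodes are undesirable.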
Conditions (21) may be replaced, for example, by the following three conditions:

$$P(x_1) = f(x_1), \qquad P'(x_1) = f'(x_1), \qquad P(x_2) = f(x_2)$$

Then the polynomial P(x) may be looked for in the form

$$P(x) = A(x - x_2)(x - 2x_1 + x_2) + B(x - x_1)(x - x_2) + C(x - x_1)^2$$

instead of (22). (Find the coefficients A, B and C for this case!)
7. Finite Differences and Their Connection with Derivatives. Before proceeding to our further investigations let us introduce one of the important notions of modern mathematics, namely, the notion of a finite difference. Let y = f(x). Then, with h given, the expression

$$\Delta_h y = f(x + h) - f(x)$$

is called a finite difference of the first order of a function f (the first difference of f). The expression

$$\frac{1}{h} \Delta_h y = \frac{f(x + h) - f(x)}{h}$$

is called the first difference quotient or the first divided difference. It follows from the definition of a derivative (see Sec. IV.2) that for a sufficiently small h we have

$$\frac{1}{h} \Delta_h y \approx y' \tag{24}$$

or, more precisely,

$$y' = \lim_{h \to 0} \frac{1}{h} \Delta_h y$$

Let, for example, y = x³. Then

$$\Delta_h y = (x + h)^3 - x^3 = 3x^2 h + 3x h^2 + h^3$$

$$\frac{1}{h} \Delta_h y = 3x^2 + 3xh + h^2$$

$$\lim_{h \to 0} \left(\frac{1}{h} \Delta_h y\right) = \lim_{h \to 0} (3x^2 + 3xh + h^2) = 3x^2 = y'$$

We also indicate here the following obvious properties of differences:

$$\Delta_h (y_1 + y_2) = \Delta_h y_1 + \Delta_h y_2$$

$$\Delta_h (Cy) = C \Delta_h y \qquad (C = \text{const})$$


We can also take a difference of the first differences, the so-called second difference:

$$\Delta_h^2 y = \Delta_h (\Delta_h y) = \Delta_h [f(x + h) - f(x)] = [f(x + 2h) - f(x + h)] - [f(x + h) - f(x)] = f(x + 2h) - 2f(x + h) + f(x)$$

The second divided difference is defined in an analogous way:

$$\frac{1}{h} \Delta_h \left(\frac{1}{h} \Delta_h y\right) = \frac{1}{h^2} \Delta_h (\Delta_h y) = \frac{1}{h^2} \Delta_h^2 y = \frac{f(x + 2h) - 2f(x + h) + f(x)}{h^2}$$

Since taking a divided difference with a small step is approximately equivalent to differentiating, the second divided difference for a small step is approximately equal to the second derivative or, more precisely,

$$y'' = \lim_{h \to 0} \frac{1}{h^2} \Delta_h^2 y = \lim_{h \to 0} \frac{f(x + 2h) - 2f(x + h) + f(x)}{h^2} \tag{25}$$

Thus, in the previous example

$$\Delta_h^2 y = \Delta_h (3x^2 h + 3x h^2 + h^3) = 3(x + h)^2 h + 3(x + h) h^2 + h^3 - 3x^2 h - 3x h^2 - h^3 = 6x h^2 + 6h^3;$$

$$\lim_{h \to 0} \frac{1}{h^2} \Delta_h^2 y = \lim_{h \to 0} (6x + 6h) = 6x = y''$$

The third difference Δ_h³y = Δ_h(Δ_h²y) and the third divided difference (1/h³)Δ_h³y (which tends to the third derivative y‴ when passing to the limit) etc. are defined similarly.

It is especially convenient to compute the differences when a function is represented by a table with a constant step h. In case a table is given in general form (I.2) we can write Δy_1 = y_2 − y_1, Δy_2 = y_3 − y_2 and, generally, Δy_k = y_{k+1} − y_k. The subscript k in the expression Δy_k now indicates the number of the difference and not the step because the step is regarded as fixed here. Further, Δ²y_1 = Δy_2 − Δy_1, Δ²y_2 = Δy_3 − Δy_2 and so on. For instance, let us take, for h = 0.1, the table

x       | 10.0    | 10.1    | 10.2    | 10.3    | 10.4    | 10.5    | 10.6    | 10.7
y       | 1.00000 | 1.00432 | 1.00860 | 1.01284 | 1.01703 | 1.02119 | 1.02531 | 1.02938
10⁵Δy   | 432     | 428     | 424     | 419     | 416     | 412     | 407     |
10⁵Δ²y  | −4      | −4      | −5      | −3      | −4      | −5      |         |
10⁵Δ³y  | 0       | −1      | 2       | −1      | −1      |         |         |

(This fragment is taken from the table of logarithms. The values of the differences are multiplied by 100,000 in order to get rid of decimal zeros.)

The smallness and the approximate constancy of the second differences in the above example indicate the smoothness of the process of change of the function and the absence of random “splashes” in the process. Such a smoothness may be manifested in differences of higher order and it always indicates the “regularity” of the change of a function. In case the step is not small or the values of the argument are close to the points of discontinuity etc. the differences may not be small but, as a rule, a certain kind of regularity in their values can be noticed. At the same time random errors occurring in the table greatly influence differences of higher order and this usually enables us to find the errors. Suppose that by mistake we wrote 1.01294 instead of 1.01284 in the table. Then the fourth line would look as −4, +6, −25, +7, −4, −5 (check it!) and the regularity would obviously be broken. That is why differences of an order higher than the second are rarely used when a table represents results of an experiment which was not carried out with high precision. In such cases one often restricts oneself to the first differences.
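
The difference table and the effect of a single erroneous entry can be reproduced with a short Python sketch. The function name `difference_rows` is our own choice, and the tabular values are stored as integers (multiplied by 100,000) so that the arithmetic is exact.

```python
def difference_rows(values, order):
    """Return the rows [values, first differences, second differences, ...]."""
    rows = [list(values)]
    for _ in range(order):
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows


# Tabular values of y multiplied by 10**5 (x = 10.0, 10.1, ..., 10.7).
y = [100000, 100432, 100860, 101284, 101703, 102119, 102531, 102938]
table = difference_rows(y, 3)

# The same table with one erroneous entry: 1.01294 instead of 1.01284.
y_bad = list(y)
y_bad[3] = 101294
table_bad = difference_rows(y_bad, 2)

print(table[1], table[2], table[3])
print(table_bad[2])
```

An error of 10 units in one entry enters the second differences with the weights 1, −2, 1, so it changes three neighbouring second differences by 10, −20 and 10 units, which is what makes it easy to detect.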

The difference y_{k+1} − y_k is sometimes attributed not to the value x_k, as above, but to the value (x_k + x_{k+1})/2 which is naturally denoted as x_{k+1/2}. Then the difference is called central and is designated by δy_{k+1/2} = y_{k+1} − y_k. Dividing the difference by the step we obtain a divided central difference. The central differences of the second order δ²y_k = δy_{k+1/2} − δy_{k−1/2} are formed similarly. They are again attributed to the “integer” values of the argument etc. (of course, it is not the values x_k of the argument x that are integers but their numbers k; the same is with the “half-integer” values x_{k+1/2} etc.).

It is seen in Fig. 138 that the value of the divided central difference which is equal to the slope of the chord BC is closer to the derivative (i.e. to the slope of the tangent at the point A) than the “simple” divided difference (which is equal to the slope of the chord AD). This assertion can be easily verified by Taylor’s series (IV.52) since the difference

$$\frac{y(x + h) - y(x)}{h} - y'(x) = \frac{h}{2} y''(x) + \frac{h^2}{6} y'''(x) + \dots$$

is of the order of h for small |h| whereas the difference

$$\frac{y\left(x + \frac{h}{2}\right) - y\left(x - \frac{h}{2}\right)}{h} - y'(x) = \frac{h^2}{24} y'''(x) + \dots$$

is of the order of h² (check it!). (The estimation of orders of accuracy of other approximate formulas for a small step is carried out similarly.) Thus, it is better to compute the approximate values of a derivative by means of a divided central difference than by formula (24). A more precise method of approximate calculation of derivatives of any order will be given in Sec. 9.

Fig. 138
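
The two orders of accuracy can be observed numerically. In the following Python sketch (an illustration only; the test function sin x, the point x = 1 and the steps are our assumptions) the step is halved several times: the error of the simple divided difference is roughly halved each time, whereas the error of the divided central difference decreases roughly four times.

```python
import math


def forward_quotient(f, x, h):
    """Simple divided difference (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def central_quotient(f, x, h):
    """Divided central difference (f(x + h/2) - f(x - h/2)) / h."""
    return (f(x + h / 2) - f(x - h / 2)) / h


x0 = 1.0
exact = math.cos(x0)  # derivative of sin x at x0
for h in (0.1, 0.05, 0.025):
    err_forward = abs(forward_quotient(math.sin, x0, h) - exact)
    err_central = abs(central_quotient(math.sin, x0, h) - exact)
    print(f"h = {h}: forward error {err_forward:.2e}, central error {err_central:.2e}")
```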


Divided central differences for a small step are close to the corresponding derivatives and resemble them in many respects whereas the differences themselves (not divided) are close to the corresponding differentials. For instance, formula (25) implies

$$\frac{1}{h^2} \Delta_h^2 y = y'' + \alpha$$

where |α| ≪ 1, i.e. α is an infinitesimal as h → 0. It follows that

$$\Delta_h^2 y = y'' h^2 + \alpha h^2 = y'' (\Delta x)^2 + \alpha h^2 = d^2 y + \alpha h^2 \qquad (|\alpha h^2| \ll h^2)$$

Hence, in case y″ ≠ 0 the value of Δ_h²y differs from that of d²y by an infinitesimal of higher order and Δ_h²y and d²y are therefore equivalent infinitesimals as h → 0 (see Sec. III.8).

The theory of finite differences was developed simultaneously 
with other basic branches of mathematical analysis. The first syste- 
matic representation of calculus of finite differences was given by 
Taylor in 1715. Finite differences are nowadays widely used in many 
theoretical investigations and in practical applications especially 
in connection with modern electronic computers. 

8. Newton’s Interpolation Formulas. If the distance h between neighbouring values of x for which a function f is given is constant we can use some formulas that are more convenient than formula (23). For example, suppose we know the values

$$f(x_0) = y_0, \qquad f(x_1) = y_1, \qquad f(x_2) = y_2, \qquad f(x_3) = y_3$$

where x_1 = x_0 + h, x_2 = x_0 + 2h and x_3 = x_0 + 3h. Then the polynomial P(x) taking the same values for the appointed values of x will be of the third degree (see Sec. 6). Newton’s idea was to seek P(x) as a polynomial of the form

$$P(x) = A + Bs + Cs(s - h) + Ds(s - h)(s - 2h) \tag{26}$$

where s = x − x_0. According to the condition there must be

$$y_0 = P(x_0) = P|_{s=0} = A, \qquad y_1 = P(x_1) = P|_{s=h} = A + Bh,$$

$$y_2 = P(x_2) = P|_{s=2h} = A + B \cdot 2h + C \cdot 2h^2,$$

$$y_3 = P(x_3) = P|_{s=3h} = A + B \cdot 3h + C \cdot 3 \cdot 2h^2 + D \cdot 3 \cdot 2h^3$$

Writing differences (see Sec. 7) for the left-hand and right-hand sides we obtain

$$\Delta y_0 = Bh, \qquad \Delta y_1 = Bh + C \cdot 2h^2, \qquad \Delta y_2 = Bh + C \cdot 2 \cdot 2h^2 + D \cdot 3 \cdot 2h^3$$

Forming differences a second and a third time we get

$$\Delta^2 y_0 = C \cdot 2h^2, \qquad \Delta^2 y_1 = C \cdot 2h^2 + D \cdot 3 \cdot 2h^3, \qquad \Delta^3 y_0 = D \cdot 3 \cdot 2h^3$$

From this we find

$$A = y_0, \qquad B = \frac{\Delta y_0}{h}, \qquad C = \frac{\Delta^2 y_0}{2!\, h^2}, \qquad D = \frac{\Delta^3 y_0}{3!\, h^3}$$

Substituting these values in (26) and taking into account that we could start from any tabular value x_k in place of x_0 we derive Newton’s formula

$$f(x) \approx P(x) = y_k + \Delta y_k \frac{s}{h} + \frac{\Delta^2 y_k}{2!} \frac{s}{h} \left(\frac{s}{h} - 1\right) + \frac{\Delta^3 y_k}{3!} \frac{s}{h} \left(\frac{s}{h} - 1\right) \left(\frac{s}{h} - 2\right) \tag{27}$$

where s = x − x_k.

Formulas for interpolation polynomials of other degrees are of a similar form. Increasing this degree we can pass to the limit as it was done in Sec. IV.16 and thus obtain an infinite series of the form

$$f(x) = y_k + \Delta y_k \frac{s}{h} + \frac{\Delta^2 y_k}{2!} \frac{s}{h} \left(\frac{s}{h} - 1\right) + \frac{\Delta^3 y_k}{3!} \frac{s}{h} \left(\frac{s}{h} - 1\right) \left(\frac{s}{h} - 2\right) + \dots \tag{28}$$

The terms that are not put down in formula (28) contain differences of the fourth, fifth etc. orders and are therefore infinitesimals of the fourth, fifth etc. orders relative to the step h. This formula is, of course, truncated in practical calculations. The number of retained terms must be chosen in such a way that the dropped terms should be negligibly small. It may turn out that it is impossible to attain this in case the step is too large or if we consider points lying near the end-point of the interval. In such circumstances formula (28) is inapplicable.

Newton’s formulas (27) and (28) are easy to use when a function f is represented in a tabular way since in such a case its differences are calculated quite simply. They are especially often applied at the beginning of a table (when, for example, we take k = 0, i.e. the first tabular value x_0, and x_0 < x < x_1). We choose the degree of an interpolation polynomial P(x) considering the values of the differences. For example, if the third differences are very small then the last term in formula (27) is also small and can be dropped, that is we can restrict ourselves to a polynomial of the second degree. In formula (28) it is also possible to put k = 0, x < x_0 if |x − x_0| is not large; this will result in a backward extrapolation of the table.
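
Newton’s formula lends itself to a simple computational scheme: the leading differences y_0, Δy_0, Δ²y_0, ... are formed first and the terms of the formula are then accumulated one by one. The Python sketch below is an illustration under our own assumptions (the function names, the use of the first four values of the logarithm table of Sec. 7 and the point x = 10.05 are chosen only as an example).

```python
import math


def leading_differences(ys):
    """Return [y_0, Δy_0, Δ²y_0, ...] for equally spaced tabular values ys."""
    lead, row = [], list(ys)
    while row:
        lead.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return lead


def newton_forward(x0, h, ys, x):
    """Evaluate Newton's interpolation polynomial built on ys at the point x."""
    p = (x - x0) / h              # this is s/h in the notation of the text
    total, coeff = 0.0, 1.0       # coeff = p(p-1)...(p-n+1)/n!
    for n, diff in enumerate(leading_differences(ys)):
        total += coeff * diff
        coeff *= (p - n) / (n + 1)
    return total


ys = [1.00000, 1.00432, 1.00860, 1.01284]    # tabular values at 10.0 ... 10.3
approx = newton_forward(10.0, 0.1, ys, 10.05)
print(approx, math.log10(10.05))
```

The accuracy of the result is limited by the rounding of the five-digit tabular values rather than by the dropped terms, in agreement with the smallness of the higher differences in that table.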

Another of Newton’s formulas,

$$f(x) = y_k - \Delta y_{k-1} \frac{t}{h} + \frac{\Delta^2 y_{k-2}}{2!} \frac{t}{h} \left(\frac{t}{h} - 1\right) - \frac{\Delta^3 y_{k-3}}{3!} \frac{t}{h} \left(\frac{t}{h} - 1\right) \left(\frac{t}{h} - 2\right) + \dots \tag{29}$$

where t = x_k − x, can be deduced like (28). This formula is used, in particular, at the end of a table, for example, if x_k is the last tabular value of the argument and x_{k−1} < x < x_k. This formula is also used for a forward extrapolation of a table.

When interpolation is carried out in the middle of a table it is desirable to have a formula that takes into account tabular values lying on the left and on the right of the value x we are interested in. One of such formulas is Bessel’s formula which is obtained by taking the half-sum of the right-hand sides of (28) and (29):

$$f(x) = \frac{y_k + y_{k+1}}{2} + \left(\frac{s}{h} - \frac{1}{2}\right) \Delta y_k + \frac{\Delta^2 y_{k-1} + \Delta^2 y_k}{2 \cdot 2!} \frac{s}{h} \left(\frac{s}{h} - 1\right) + \frac{\Delta^3 y_{k-1}}{3!} \frac{s}{h} \left(\frac{s}{h} - \frac{1}{2}\right) \left(\frac{s}{h} - 1\right) + \dots \tag{30}$$

where s = x − x_k. This formula provides a high accuracy; it was named after F. W. Bessel (1784-1846), a German astronomer, but it was in fact established by Newton.

Interpolation formulas are also used for the so-called problem of inverse interpolation which consists in finding the values of the argument for given values of a function. Let us begin, for example, with formula (27). Regarding this equality as an exact one we can solve it for the second summand in the right-hand side which, after division by Δy_k, yields

$$\frac{s}{h} = \frac{y - y_k}{\Delta y_k} - \frac{1}{2} \frac{\Delta^2 y_k}{\Delta y_k} \frac{s}{h} \left(\frac{s}{h} - 1\right) - \frac{1}{6} \frac{\Delta^3 y_k}{\Delta y_k} \frac{s}{h} \left(\frac{s}{h} - 1\right) \left(\frac{s}{h} - 2\right) \tag{31}$$

If y is given then in order to find s we can apply the iterative method (see Sec. 3). To do this we can choose (s/h)_0 = (y − y_k)/Δy_k as the zeroth approximation. Substituting this value into the right-hand side of (31) we obtain (s/h)_1, and so forth. The process converges well for small h.
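
The iterative process for inverse interpolation can be sketched as follows. The Python function below is an illustration only: its name, its argument list and the number of iterations are our assumptions, and the example again uses the leading differences of the logarithm table of Sec. 7 (Δy = 0.00432, Δ²y = −0.00004, Δ³y = 0) to find the argument for which y = 1.00216.

```python
def inverse_interpolate(xk, h, yk, d1, d2, d3, y, iterations=10):
    """Find x with f(x) = y from the tabular value yk and the differences d1, d2, d3."""
    p = (y - yk) / d1                      # zeroth approximation of s/h
    for _ in range(iterations):
        p = ((y - yk) / d1
             - 0.5 * (d2 / d1) * p * (p - 1)
             - (d3 / (6 * d1)) * p * (p - 1) * (p - 2))
    return xk + p * h                      # x = x_k + s


x_found = inverse_interpolate(10.0, 0.1, 1.00000, 0.00432, -0.00004, 0.0, 1.00216)
print(x_found, 10 ** 1.00216)
```

The ratio of the second difference to the first one is small here, so the correction terms change s/h only slightly and a few iterations suffice.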

It should be taken into account that when we interpolate a dis- 
continuous function or a function with a discontinuous derivative 
the accuracy of an approximation may decrease considerably near 
the points of discontinuity since the interpolation polynomial itself 
has no discontinuities. A discontinuity may be imitated by bringing 
nodes of interpolation very close to each other but it is usually 
preferable to carry out an interpolation process only on intervals 
lying between the points of discontinuity. 

9. Numerical Differentiation. Numerical differentiation is usually applied when a function whose derivative must be found is defined by a table. This can be carried out as follows: the function is replaced by a polynomial according to the methods of Secs. 6 and 8 and then the derivatives of the polynomial are taken as the approximations to the derivatives of the original function. For example, formula (27) implies (check it!)

$$f'(x) \approx \frac{\Delta y_k}{h} + \frac{\Delta^2 y_k}{2!\, h} \left(2 \frac{s}{h} - 1\right) + \frac{\Delta^3 y_k}{3!\, h} \left[3 \left(\frac{s}{h}\right)^2 - 6 \frac{s}{h} + 2\right]$$

(Taking formula (28) with a greater number of terms we can get a more accurate result.) In particular, putting x = x_k (i.e. s = 0) we derive

$$f'(x_k) \approx \frac{1}{h} \left(\Delta y_k - \frac{\Delta^2 y_k}{2} + \frac{\Delta^3 y_k}{3}\right)$$

A more precise formula may be represented as an infinite series of the form

$$f'(x_k) = \frac{1}{h} \left(\frac{\Delta y_k}{1} - \frac{\Delta^2 y_k}{2} + \frac{\Delta^3 y_k}{3} - \frac{\Delta^4 y_k}{4} + \dots\right) \tag{32}$$

In like manner we can deduce formulas for the derivatives of the second and subsequent orders. Formulas (29) and (30) (and other interpolation formulas) may also be used for these purposes.
Let us, in particular, put down the following formula implied by formula (30):

$$f'(x_k) = \frac{1}{2h} \left\{ (\Delta y_{k-1} + \Delta y_k) - \frac{1}{6} (\Delta^3 y_{k-2} + \Delta^3 y_{k-1}) + \frac{1}{30} (\Delta^5 y_{k-3} + \Delta^5 y_{k-2}) - \dots \right\}$$

The subsequent terms of the last formula are, respectively, of the first, third, fifth etc. orders, i.e. the coefficients decrease and the orders of infinitesimals increase here faster than those of the terms of series (32).
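
A truncated form of series (32) is easily programmed. The Python sketch below is an illustration only (the function name and the sample table of the function x³ are our assumptions); it forms the successive differences Δy_k, Δ²y_k, ... and sums them with the alternating coefficients 1, −1/2, 1/3, ...

```python
def derivative_from_differences(ys, h, k=0, order=3):
    """Approximate f'(x_k) by (1/h)(Δy_k - Δ²y_k/2 + Δ³y_k/3 - ...)."""
    row = list(ys[k:])
    if len(row) <= order:
        raise ValueError("not enough tabular values for the requested order")
    total = 0.0
    for n in range(1, order + 1):
        row = [b - a for a, b in zip(row, row[1:])]   # differences of order n
        total += (-1) ** (n + 1) * row[0] / n
    return total / h


# Hypothetical table: y = x**3 at x = 1.0, 1.1, 1.2, 1.3 (step h = 0.1).
ys = [x**3 for x in (1.0, 1.1, 1.2, 1.3)]
print(derivative_from_differences(ys, 0.1))
```

For a polynomial of the third degree all differences of order higher than the third vanish, so three terms of the series already give the derivative exactly (up to rounding).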

If a table represents some results of an experiment then even a small error in a value of a function may lead (after the division by a small step) to a finite or even a large error in computing a value of the derivative. The situation becomes still worse when we calculate derivatives of higher order. It is therefore desirable that the step of a table should be considerably greater (e.g. 10 times greater) than the possible error in determining the values of the function. The step should be still greater (e.g. 100 times greater) than the error when we calculate the second derivative and so on. As a result of these difficulties it is usually preferable to use other (empirical) formulas (compare with Sec. I.30) instead of interpolation formulas when we differentiate empirical functions. Since empirical formulas are constructed by taking into account all the peculiarities of experimental data they are considerably more stable with respect to random errors of an experiment.

Interpolation formulas and formulas for numerical differentiation 
are treated in courses on approximate calculations. 


CHAPTER VI 


Determinants and Systems 
of Linear 
Algebraic Equations 


§ 1. Determinants 


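1. Definition. The concept of a determinant arises when we investigate systems of algebraic equations of the first degree. Let us first take the system of equations

$$\begin{aligned} a_1 x + b_1 y &= d_1 \\ a_2 x + b_2 y &= d_2 \end{aligned} \tag{1}$$

in two unknowns x and y. Solving the system (we leave the calculations to the reader) we get the answer

$$x = \frac{d_1 b_2 - b_1 d_2}{a_1 b_2 - b_1 a_2}, \qquad y = \frac{a_1 d_2 - d_1 a_2}{a_1 b_2 - b_1 a_2} \tag{2}$$

The expression a_1 b_2 − b_1 a_2 is called the determinant of the second order. It is designated by the symbol

$$\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = a_1 b_2 - b_1 a_2 \tag{3}$$

This notation enables us to rewrite formulas (2) in the form

$$x = \frac{\begin{vmatrix} d_1 & b_1 \\ d_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}, \qquad y = \frac{\begin{vmatrix} a_1 & d_1 \\ a_2 & d_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}} \tag{4}$$

Let us consider an example of computing a determinant:

$$\begin{vmatrix} 0 & -3 \\ 2 & 1 \end{vmatrix} = 0 \cdot 1 - (-3) \cdot 2 = 0 + 6 = 6$$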

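The same process applied to solving the system of equations

$$\begin{aligned} a_1 x + b_1 y + c_1 z &= d_1 \\ a_2 x + b_2 y + c_2 z &= d_2 \\ a_3 x + b_3 y + c_3 z &= d_3 \end{aligned} \tag{5}$$

yields the fractions which have the denominator of the form

$$a_1 b_2 c_3 - a_1 c_2 b_3 - b_1 a_2 c_3 + b_1 c_2 a_3 + c_1 a_2 b_3 - c_1 b_2 a_3 \tag{6}$$

The last expression is called the determinant of the third order and is designated as

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} \tag{7}$$

Transforming expression (6) and bearing in mind notation (3) we derive the formula

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1 (b_2 c_3 - c_2 b_3) - b_1 (a_2 c_3 - c_2 a_3) + c_1 (a_2 b_3 - b_2 a_3) = a_1 \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} - b_1 \begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix} + c_1 \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix} \tag{8}$$

which is of use for calculating a determinant. For instance,

$$\begin{vmatrix} 1 & 0 & -2 \\ -1 & 1 & 1 \\ 3 & 1 & 2 \end{vmatrix} = 1 \cdot \begin{vmatrix} 1 & 1 \\ 1 & 2 \end{vmatrix} - 0 \cdot \begin{vmatrix} -1 & 1 \\ 3 & 2 \end{vmatrix} + (-2) \cdot \begin{vmatrix} -1 & 1 \\ 3 & 1 \end{vmatrix} = (1 \cdot 2 - 1 \cdot 1) - 2 (-1 \cdot 1 - 1 \cdot 3) = 1 + 8 = 9$$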


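Determinants of the fourth order are defined by analogy with formula (8):

$$\begin{vmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ a_4 & b_4 & c_4 & d_4 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix} - b_1 \begin{vmatrix} a_2 & c_2 & d_2 \\ a_3 & c_3 & d_3 \\ a_4 & c_4 & d_4 \end{vmatrix} + c_1 \begin{vmatrix} a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \\ a_4 & b_4 & d_4 \end{vmatrix} - d_1 \begin{vmatrix} a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \\ a_4 & b_4 & c_4 \end{vmatrix}$$

(we suggest that the reader should carefully think over the structure of the expression entering in the right-hand side). Determinants of the fifth, sixth etc. order (and, generally, determinants of the nth order) are introduced in a similar way.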
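
The recursive structure of this definition (a determinant of the nth order is expressed through determinants of order n − 1 with alternating signs) can be written down directly. The Python sketch below is an illustration only; the function name `det` and the sample matrices are our own choices, and the method is practical only for small orders because the number of operations grows very rapidly with n.

```python
def det(m):
    """Determinant of a square matrix (list of rows) by expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Delete the first row and the j-th column to obtain a determinant of lower order.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


print(det([[0, -3], [2, 1]]))                      # second order
print(det([[1, 0, -2], [-1, 1, 1], [3, 1, 2]]))    # third order
```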

2. Properties. We are going to describe the properties of determinants but we shall take only the determinants of the third order of form (7) although all these properties hold for determinants of any order.

A determinant of the third order of form (7) has three rows and three columns. It consists of nine elements (i.e. of the numbers a_1, b_1, ..., c_3).

1. Interchanging two rows or two columns of a determinant is equivalent to multiplying the determinant by −1. For example,

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = - \begin{vmatrix} c_1 & b_1 & a_1 \\ c_2 & b_2 & a_2 \\ c_3 & b_3 & a_3 \end{vmatrix} \tag{9}$$

(we have interchanged the third and the first columns). This is proved by comparing both sides of (9) according to formula (8):

$$- \begin{vmatrix} c_1 & b_1 & a_1 \\ c_2 & b_2 & a_2 \\ c_3 & b_3 & a_3 \end{vmatrix} = -c_1 \begin{vmatrix} b_2 & a_2 \\ b_3 & a_3 \end{vmatrix} + b_1 \begin{vmatrix} c_2 & a_2 \\ c_3 & a_3 \end{vmatrix} - a_1 \begin{vmatrix} c_2 & b_2 \\ c_3 & b_3 \end{vmatrix} = -c_1 (b_2 a_3 - a_2 b_3) + b_1 (c_2 a_3 - a_2 c_3) - a_1 (c_2 b_3 - b_2 c_3)$$

If we remove the parentheses here we shall obtain the expression equal to (6). This proves formula (9).

2. A determinant having two identical rows or columns is equal to zero. For example,

$$P = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_2 & b_2 & c_2 \end{vmatrix} = 0$$

(here the second row coincides with the third one). Indeed, if we interchange the two rows then by property 1 we get −P = P, i.e. P = 0.

3. A common factor entering into all the elements of a row or of a column can be taken outside the determinant. For instance,

$$\begin{vmatrix} a_1 & k b_1 & c_1 \\ a_2 & k b_2 & c_2 \\ a_3 & k b_3 & c_3 \end{vmatrix} = k \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} \tag{10}$$

The proof is carried out by verifying the equality.

4. A determinant having a row or a column consisting of zeros equals zero. In order to prove this assertion it is sufficient to put k = 0 in formula (10).

5. If each element of a row or of a column (for instance, of the second row) can be represented in the form of a sum of two terms the determinant itself can be represented as a sum of two determinants according to the formula

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2' + a_2'' & b_2' + b_2'' & c_2' + c_2'' \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2' & b_2' & c_2' \\ a_3 & b_3 & c_3 \end{vmatrix} + \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2'' & b_2'' & c_2'' \\ a_3 & b_3 & c_3 \end{vmatrix}$$

The proof is carried out by verifying the equality of both sides.

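6. Adding arbitrary numbers proportional to the elements of a row (a column) to the corresponding elements of another row (column) of a determinant we do not change the numerical value of the determinant. Indeed, for instance,

$$\begin{vmatrix} a_1 + k c_1 & b_1 & c_1 \\ a_2 + k c_2 & b_2 & c_2 \\ a_3 + k c_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} + \begin{vmatrix} k c_1 & b_1 & c_1 \\ k c_2 & b_2 & c_2 \\ k c_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} + k \begin{vmatrix} c_1 & b_1 & c_1 \\ c_2 & b_2 & c_2 \\ c_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$$

(in the calculations we have applied, in succession, properties 5, 3 and 2).

7. The value of a determinant does not change if each of its rows is replaced by the corresponding column and vice versa, that is

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$$

(this is the operation of transposing a determinant). The proof is carried out by verifying the equality.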

3. Expanding a Determinant in Minors of Its Row or Column. 
First we introduce the notion of a cofactor of an element of a deter- 
minant. Suppose we choose an element of a determinant and then 
delete the row and the column to which the element belongs. Thus 
we get a determinant of lower order which is called the minor 
(minor determinant) of the element of a determinant. Now let us 
supply each minor with the sign + or — depending on the position 
the corresponding element occupies in the original determinant 
according to the following rule: we take + for the minor of the element standing in the left upper corner of a determinant (this element is sometimes called the origin of the determinant) and alternate the signs of other elements in chess-board order according to the scheme

$$\begin{matrix} + & - & + \\ - & + & - \\ + & - & + \end{matrix}$$

The quantities thus obtained are called cofactors (or signed minors or algebraic adjuncts) of the elements of a determinant. For instance, the cofactor A_1 of the element a_1 [see determinant (7)] is equal to

$$+ \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix}$$

and the cofactor C_2 of the element c_2 is equal to

$$- \begin{vmatrix} a_1 & b_1 \\ a_3 & b_3 \end{vmatrix}$$

There is a general property of determinants which is formulated as follows: a determinant is equal to the sum of the products of the elements of any row or column of the determinant by their cofactors. For example,

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = b_1 B_1 + b_2 B_2 + b_3 B_3 = -b_1 \begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix} + b_2 \begin{vmatrix} a_1 & c_1 \\ a_3 & c_3 \end{vmatrix} - b_3 \begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}$$

This representation of a determinant is called the expansion of the determinant in minors of its row or column (we have the expansion in minors of the second column in the example). We also say that the determinant is expanded in terms of the elements of its row or column. As in Sec. 2, the proof is based on verifying the equality.

The properties enumerated in Secs. 2 and 3 are applied to evaluating determinants. For example, let us compute the determinant

$$D = \begin{vmatrix} 1 & 0 & 2 & -1 \\ 3 & 1 & 0 & -1 \\ 2 & 1 & -1 & 0 \\ 0 & 3 & 2 & 1 \end{vmatrix}$$

Here we can apply property 6 of Sec. 2 and make all the elements of a certain row or column but one equal to zero. Then expanding the determinant in minors of this row or column we get only one nonzero summand because all other minors are multiplied by zeros. If, for instance, we want only the element occupying the second place in the third row of the determinant to be unequal to zero we should multiply the second column by −2, add it to the first column and after this add the second column of the resulting determinant to its third column. Then we obtain

$$D = \begin{vmatrix} 1 & 0 & 2 & -1 \\ 1 & 1 & 0 & -1 \\ 0 & 1 & -1 & 0 \\ -6 & 3 & 2 & 1 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 2 & -1 \\ 1 & 1 & 1 & -1 \\ 0 & 1 & 0 & 0 \\ -6 & 3 & 5 & 1 \end{vmatrix}$$

(the two operations are usually carried out simultaneously, by “one step”). Now expanding in minors of the third row we obtain

$$D = -1 \cdot \begin{vmatrix} 1 & 2 & -1 \\ 1 & 1 & -1 \\ -6 & 5 & 1 \end{vmatrix}$$

Here we can, for example, subtract the second row from the first one which results in

$$D = - \begin{vmatrix} 0 & 1 & 0 \\ 1 & 1 & -1 \\ -6 & 5 & 1 \end{vmatrix}$$

Now if we expand D in minors of the first row we finally obtain

$$D = -(-1) \begin{vmatrix} 1 & -1 \\ -6 & 1 \end{vmatrix} = 1 \cdot 1 - (-1)(-6) = -5$$


In case the elements of a determinant are approximate numbers the method is used in a modified form. We shall illustrate this technique by evaluating the determinant

$$D = \begin{vmatrix} -1.37 & 2.15 & 0.76 \\ 2.31 & -1.78 & -4.32 \\ -0.86 & 2.13 & 3.25 \end{vmatrix}$$

Let us factor out the first element of the first row using the rule of a reserve decimal digit (see Sec. I.9):

$$D = -1.37 \begin{vmatrix} 1 & -1.569 & -0.555 \\ 2.31 & -1.78 & -4.32 \\ -0.86 & 2.13 & 3.25 \end{vmatrix}$$

Multiply the first row by −2.31 and add it to the second row and, simultaneously, multiply the first row by 0.86 and add it to the third row:

$$D = -1.37 \begin{vmatrix} 1 & -1.569 & -0.555 \\ 0 & 1.844 & -3.038 \\ 0 & 0.781 & 2.773 \end{vmatrix} = -1.37 \begin{vmatrix} 1.844 & -3.038 \\ 0.781 & 2.773 \end{vmatrix}$$

Let us repeat this procedure:

$$D = -1.37 \times 1.844 \begin{vmatrix} 1 & -1.647 \\ 0.781 & 2.773 \end{vmatrix} = -1.37 \times 1.844 \begin{vmatrix} 1 & -1.647 \\ 0 & 4.059 \end{vmatrix} = -1.37 \times 1.844 \times 4.059 = -10.3$$


206 k INTRODUCTORY MATHEMATICS FOR ENGINEERS 


This method is also applicable to determinants of higher order. 
It cannot be used in case some of the determinants occurring in the 
process of calculating contain the number 0 (or a number which 
is close to zero and known with a low degree of relative accuracy) 
as its left uppermost element. To overcome the difficulty we can 
slightly change the method and begin not with the left uppermost 
element but with the element which is the greatest in its absolute 
value. This element (the principal element) should be factored out 
of the row (column) in which it is contained. (The element —4.32 
is the principal element in our previous example.) 
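
The procedure can be sketched in Python as follows. This is an illustration under stated assumptions and not the exact scheme of the text: the function name is ours, and instead of the element greatest in absolute value in the whole determinant the sketch chooses, at each step, the greatest element of the current column (a common simplification), interchanging rows and changing the sign of the result by property 1 of Sec. 2.

```python
def det_by_elimination(matrix):
    """Determinant by successive elimination with a column-wise choice of the principal element."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        # Row (at or below the diagonal) whose element in this column is greatest in absolute value.
        pivot_row = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot_row][col] == 0.0:
            return 0.0                       # a column of zeros remains: the determinant vanishes
        if pivot_row != col:
            a[col], a[pivot_row] = a[pivot_row], a[col]
            result = -result                 # interchanging two rows changes the sign
        pivot = a[col][col]
        result *= pivot                      # the principal element is factored out
        for r in range(col + 1, n):
            factor = a[r][col] / pivot
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


# Hypothetical check on a fourth-order determinant with integer elements.
print(det_by_elimination([[1, 0, 2, -1], [3, 1, 0, -1], [2, 1, -1, 0], [0, 3, 2, 1]]))
```

The number of arithmetic operations grows here only as the cube of the order, which is why elimination rather than direct expansion in minors is used for determinants of high order.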

The fundamentals of the theory of determinants were introduced 
in 1750 by the Swiss mathematician G. Cramer (1704-1752). 


§ 2. Systems of Linear Algebraic Equations 


4. Basic Case. We shall limit ourselves to such systems in which the number of equations coincides with the number of the unknowns. We shall deal only with the systems of three equations, i.e. systems of type (5). For example, if we want to find y we should multiply the first of equations (5) by the cofactor B_1 of the element b_1 of determinant (7), multiply the second equation by B_2 and the third one by B_3 and then sum together the results. Doing this we derive

$$(a_1 B_1 + a_2 B_2 + a_3 B_3) x + (b_1 B_1 + b_2 B_2 + b_3 B_3) y + (c_1 B_1 + c_2 B_2 + c_3 B_3) z = d_1 B_1 + d_2 B_2 + d_3 B_3 \tag{11}$$

But the expression inside the first parentheses is equal to

$$\begin{vmatrix} a_1 & a_1 & c_1 \\ a_2 & a_2 & c_2 \\ a_3 & a_3 & c_3 \end{vmatrix} \tag{12}$$

In fact, if we expand the determinant in minors of the second column we obtain the sum of the products of the elements a_1, a_2 and a_3 by their cofactors, these cofactors in determinant (12) being equal to the cofactors of the elements b_1, b_2 and b_3 in determinant (7), i.e. equal to B_1, B_2 and B_3, respectively. Determinant (12) has two identical columns and, by property 2 in Sec. 2, is therefore equal to zero. By the same reason, the expression in the third parentheses of formula (11) also equals zero whereas the right-hand side of (11) is equal to

$$\begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}$$

The expression in the second parentheses in (11) is just equal to determinant (7) itself by Sec. 3. This determinant consists of the coefficients of the unknown quantities in system (5) and is called the determinant of the system; we shall denote it, for brevity, by the letter D. Thus, from (11) we deduce

$$D y = \begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix} \tag{13}$$

In the same way we find

$$D x = \begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix} \qquad \text{and} \qquad D z = \begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix} \tag{14}$$

Now let us first suppose that D ≠ 0; this is the basic case. As will be shown at the end of Sec. X.7, in this case the system in question has a unique solution. From (13) and (14) we derive the formulas for the solution:

$$x = \frac{1}{D} \begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}, \qquad y = \frac{1}{D} \begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}, \qquad z = \frac{1}{D} \begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}$$

Thus, every unknown quantity is equal to a quotient which has the determinant of the system in the denominator, the numerator being the determinant which is obtained from the determinant of the system by substituting the column consisting of the right-hand members of system (5) for the column of the coefficients of that unknown.

Let us take, for example, the system of equations 


Le 
22 +y—2=0 
g—dy+2=—2 


Il 


Suppose it is necessary to find the value of z. Then 


t 0 i 
2 1 0 
LS ale Sees aM Dh Rae A CaO) Dear cee SY eae eee 
SETTNA a AEE (4a EF a 
2 44 
4) PED 4 


In the same way we compute the remaining unknowns and it is 
only the numerators that should be evaluated since the denomina- 


tors equal D = 9. 
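
The rule stated above translates directly into a short program. The Python sketch below is an illustration only: the function names and the sample system (x + y + z = 6, 2x − y + z = 3, x + 2y − z = 2) are our assumptions and are not taken from the text.

```python
def det3(m):
    """Third-order determinant computed by formula (8)."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * (b2 * c3 - c2 * b3)
            - b1 * (a2 * c3 - c2 * a3)
            + c1 * (a2 * b3 - b2 * a3))


def solve_by_determinants(coeffs, rhs):
    """Solve a system of three equations by replacing columns with the right-hand members."""
    D = det3(coeffs)
    if D == 0:
        raise ValueError("D = 0: the formulas are not applicable")
    solution = []
    for col in range(3):
        # Substitute the column of right-hand members for the column of this unknown.
        replaced = [row[:col] + [d] + row[col + 1:] for row, d in zip(coeffs, rhs)]
        solution.append(det3(replaced) / D)
    return solution


coeffs = [[1, 1, 1], [2, -1, 1], [1, 2, -1]]   # hypothetical system
rhs = [6, 3, 2]
print(solve_by_determinants(coeffs, rhs))
```

Only the three numerators have to be evaluated separately; the determinant D of the system is computed once and serves as the common denominator.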
The above formulas are called Cramer’s rule. If we apply the formulas to system (1) of two equations in two unknowns we obtain formulas (4).

5. Numerical Solution. The application of the formulas given in Sec. 4 is inconvenient in case the coefficients of equations are approximate numbers and, in particular, if the number of equations is large.

There are many methods that can be used in such cases. Here we 
shall represent two of them. We again take system (5) as an example 
but the methods are applicable to systems with any arbitrary num- 
ber of unknowns. 

Gauss’ method (the method of elimination), named after K. F. Gauss (1777-1855), a famous German mathematician who also obtained fundamental results in astronomy and geodesy, is analogous to the method of evaluating determinants at the end of Sec. 3.

We divide the first equation of system (5) by a_1 which results in

$$x + b_1' y + c_1' z = d_1' \tag{15}$$

(the primes designate the new coefficients here). Multiply equation (15) by −a_2 (−a_3) and add it to the second (third) equation of the system. This eliminates x and yields the system

$$\begin{aligned} b_2' y + c_2' z &= d_2' \\ b_3' y + c_3' z &= d_3' \end{aligned} \tag{16}$$

Now we divide the first of equations (16) by b_2' and obtain

$$y + c_2'' z = d_2'' \tag{17}$$

Multiplying the last equation by −b_3' and adding it to the second of equations (16) we derive the equation of the form

$$c_3'' z = d_3'' \tag{18}$$

that is y is also eliminated. From (18) we find z; substituting it into (17) we determine y; then substituting y and z into (15) we finally evaluate x.

As in Sec. 3, if at some stage of the calculations the left uppermost coefficient turns out to be considerably smaller than others in its absolute value the method should be modified, that is we should eliminate the unknown whose coefficient is the greatest in the absolute value by dividing the corresponding equation by the coefficient.
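
The elimination process and the subsequent back substitution can be sketched in Python. This is an illustration under our own assumptions: the function name and the sample system are hypothetical, and the principal element is sought only within the current column (by interchanging equations) rather than among all the coefficients.

```python
def gauss_solve(coeffs, rhs):
    """Solve a linear system by elimination followed by back substitution."""
    n = len(rhs)
    # Each row holds the coefficients of one equation followed by its right-hand member.
    a = [[float(v) for v in row] + [float(d)] for row, d in zip(coeffs, rhs)]
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot_row][col] == 0.0:
            raise ValueError("the determinant of the system is equal to zero")
        a[col], a[pivot_row] = a[pivot_row], a[col]
        pivot = a[col][col]
        a[col] = [v / pivot for v in a[col]]           # divide the equation by its leading coefficient
        for r in range(col + 1, n):
            factor = a[r][col]
            a[r] = [v - factor * p for v, p in zip(a[r], a[col])]   # eliminate the unknown
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution, last unknown first
        x[i] = a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))
    return x


print(gauss_solve([[1, 1, 1], [2, -1, 1], [1, 2, -1]], [6, 3, 2]))
```

The same function applies without change to systems with any number of unknowns, which is the main practical advantage of elimination over the determinant formulas of Sec. 4.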

The iterative method (compare with Sec. V.3) is applied to system 
(5) in the following way. The system is rewritten in the form 


z = Ox + Biy + Yaz + ôi | 


(16) 


ll 


Y = Qat + Bay + Vz + 5, 
zZ = Qat + Bay + Yaz + 65 


(19) 


Then we choose certain values $x = x_0$, $y = y_0$ and $z = z_0$ as the
zeroth approximation. These values are substituted for $x$, $y$ and $z$,
respectively, into the right-hand sides of system (19) and thus the
first approximation $x_1$, $y_1$ and $z_1$ is obtained. Substituting
$x_1$, $y_1$ and $z_1$ again into the right-hand sides of (19) we get the second
approximation, and so forth. Generally, the $(n + 1)$th approximation is
expressed in terms of the $n$th approximation by the formulas

$$\left.\begin{aligned} x_{n+1} &= \alpha_1 x_n + \beta_1 y_n + \gamma_1 z_n + \delta_1 \\ y_{n+1} &= \alpha_2 x_n + \beta_2 y_n + \gamma_2 z_n + \delta_2 \\ z_{n+1} &= \alpha_3 x_n + \beta_3 y_n + \gamma_3 z_n + \delta_3 \end{aligned}\right\} \tag{20}$$

If the process converges, that is, the successive approximations
have limits as $n \to \infty$ ($x_n \to \bar{x}$, $y_n \to \bar{y}$ and $z_n \to \bar{z}$), then passing
to the limit in formulas (20) as $n \to \infty$ we see that $x = \bar{x}$, $y = \bar{y}$
and $z = \bar{z}$ form the solution of system (19).

Just as in the examples given in Sec. V.3, the smaller the absolute 
values of the coefficients in the unknowns in the right-hand sides 
of the system, the faster the convergence of the iterative method for 
system (19). Some more precise tests for the convergence will be 
given in Sec. XVII.18. 
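
A minimal sketch of the iteration is given below. The coefficients are hypothetical and deliberately small, so that the process converges quickly; the function name `iterate` is likewise only illustrative.

```python
def iterate(alpha, delta, start, steps):
    """Apply the iteration formulas repeatedly, starting from the zeroth approximation."""
    x = list(start)
    for _ in range(steps):
        # Every new component is computed from the previous approximation only.
        x = [sum(a * v for a, v in zip(row, x)) + dl
             for row, dl in zip(alpha, delta)]
    return x

# Hypothetical system already written in the iterative form:
#   x = 0.1 y + 0.2 z + 1,   y = 0.1 x + 0.1 z + 2,   z = 0.2 x + 0.1 y + 3
alpha = [[0.0, 0.1, 0.2], [0.1, 0.0, 0.1], [0.2, 0.1, 0.0]]
delta = [1.0, 2.0, 3.0]
approx = iterate(alpha, delta, [0.0, 0.0, 0.0], 50)

# The limit must reproduce itself when substituted into the right-hand sides.
residual = [sum(a * v for a, v in zip(row, approx)) + dl - xi
            for row, dl, xi in zip(alpha, delta, approx)]
```

Here the sum of the absolute values of the coefficients in each row does not exceed 0.3, so every step reduces the error at least by that factor.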

System (5) is solved most simply when it has the triangular form:

$$\begin{aligned} a_1 x &= d_1 \\ a_2 x + b_2 y &= d_2 \\ a_3 x + b_3 y + c_3 z &= d_3 \end{aligned}$$

In this case we immediately find $x$ from the first equation of the
system; substituting it into the second equation we obtain $y$ and
then determine $z$ after substituting $x$ and $y$ into the third equation.
Gauss’ method described above is essentially the method of reducing
general system (5) to the triangular form (15), (17) and (18).

Different useful rules for numerical solution of systems of linear 
algebraic equations can be found in [3], [10] and [13].

6. Singular Case. If the determinant of a system is equal to zero 
we shall say that there is a singular case. As it will be shown in 
Sec. X.7, in such a case there may be one of the following two pos- 
sibilities. 

1. The system is inconsistent (that is contradictory) which means 
that it has no solution. For example, the system 


$$\left.\begin{aligned} x + 2y - z &= 1 \\ 2x - y &= 3 \\ 3x + y - z &= 5 \end{aligned}\right\}$$


is just of this type since adding together the first two equations we 
arrive at a contradiction with the third one. 


2. The system has infinitely many solutions. In this case the
equations of the system must be dependent, i.e. one of the equations
is a consequence of the others. For example, the system

$$\left.\begin{aligned} x + 2y - z &= 1 \\ 2x - y &= 3 \\ 3x + y - z &= 4 \end{aligned}\right\} \tag{21}$$


belongs to this type. The third equation here is implied by the 
first two since it is the result of their addition. The third equation 
may therefore be discarded, i.e. it is permissible not to take it 
into account. To find the general solution of the system, i.e. the
totality of all its solutions, we rewrite the first two equations in the form

$$\begin{aligned} x + 2y &= 1 + z \\ 2x - y &= 3 \end{aligned}$$

whence we easily find $x = \frac{7 + z}{5}$, $y = \frac{2z - 1}{5}$ and $z = z$. The
variable $z$ remains arbitrary here. Making $z$ assume all possible
values we obtain the infinite set of all the solutions. For example,
putting $z = 0$ we obtain the solution $x = \frac{7}{5}$, $y = -\frac{1}{5}$ and $z = 0$;
if $z = 3$ the corresponding solution is $x = 2$, $y = 1$ and $z = 3$, etc.
[these are particular solutions of system (21)].

The above examples are typical in a certain sense. It turns out 
that in case the determinant of a system is equal to zero there will 
always exist one or several relationships between the left-hand 
sides of the system. If the same relationships are also fulfilled for 
the corresponding right-hand sides the system will have infinitely 
many solutions; if otherwise there will be no solutions at all. 
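
The distinction between the two singular subcases can be checked mechanically. The sketch below is an illustration under an assumption the text has not introduced at this point: it compares the rank of the coefficient matrix with the rank of the matrix extended by the right-hand sides, which is one standard way to test whether the relationships between the left-hand sides also hold for the right-hand sides. The names `rank` and `classify` are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix found by exact elimination."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        # Look for a row at or below position r with a nonzero entry in this column.
        p = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def classify(A, d):
    """Return 'unique', 'none' or 'infinite' for a square system A x = d."""
    augmented = [list(row) + [rhs] for row, rhs in zip(A, d)]
    if rank(A) == len(A):
        return "unique"
    return "infinite" if rank(A) == rank(augmented) else "none"

# Left-hand sides shared by the two examples of this section.
A = [[1, 2, -1], [2, -1, 0], [3, 1, -1]]
first = classify(A, [1, 3, 5])   # right-hand sides of the inconsistent example
second = classify(A, [1, 3, 4])  # right-hand sides of the dependent example
```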

All the possibilities can be visually illustrated by system (1).
As we already know from Sec. II.9, each of the equations (1) defines
a straight line in the $x, y$-plane, and thus solving the system means
seeking the point of intersection of the two straight lines.
The condition $D \neq 0$ can be rewritten as $\frac{a_1}{a_2} \neq \frac{b_1}{b_2}$;
the geometrical meaning of the condition is easy to verify: since
these straight lines are not parallel they have only one point of
intersection. If $D = 0$ the straight lines are parallel. Then one of
the following two subcases may occur: if $\frac{a_1}{a_2} = \frac{b_1}{b_2} \neq \frac{d_1}{d_2}$
then the straight lines have no common points at all, that is, system
(1) is contradictory; if $\frac{a_1}{a_2} = \frac{b_1}{b_2} = \frac{d_1}{d_2}$ then the straight lines simply
coincide and the equations of system (1) are equivalent, i.e. there
exists an infinitude of solutions (the whole “straight line of solutions”).

The indicated complications in the case D = 0 lead to some prac-
tical difficulties even when the determinant of the system is unequal
to zero but is very small because in such circumstances the solution
is obtained with a low accuracy. Therefore one should try to avoid
dealing with systems having such determinants.

The so-called system of homogeneous linear algebraic equations
(that is, a system whose constant terms are all equal to zero)
containing $n$ equations in $n$ unknowns is an important special case.
For instance, if $n = 3$ such a system is written as

$$\left.\begin{aligned} a_1 x + b_1 y + c_1 z &= 0 \\ a_2 x + b_2 y + c_2 z &= 0 \\ a_3 x + b_3 y + c_3 z &= 0 \end{aligned}\right\}$$

Of course, such a system always has the zero solution (the trivial
solution) $x = 0$, $y = 0$ and $z = 0$. It is often important to find out
whether there exist other, nonzero, solutions. It is easy to answer
this question on the basis of the foregoing discussion. If the
determinant of the system $D \neq 0$ then, by Sec. 4, there must exist a
unique solution and therefore there cannot exist a nonzero solution.
But if $D = 0$ then, by the beginning of this section, the system has
infinitely many nonzero solutions since a homogeneous system
cannot be inconsistent. The nonzero solutions are found in the same
way as for system (21).

Thus, discarding, for definiteness, the third equation and taking
an arbitrary value $z = t$ we arrive at a system of equations of
the form

$$\left.\begin{aligned} a_1 x + b_1 y &= -c_1 t \\ a_2 x + b_2 y &= -c_2 t \end{aligned}\right\} \tag{22}$$

Solving it according to the rules of Sec. 4 (of course, if its
determinant $\Delta \neq 0$) we obtain

$$x = \frac{1}{\Delta}\begin{vmatrix} -c_1 t & b_1 \\ -c_2 t & b_2 \end{vmatrix} = \frac{t}{\Delta}\begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix} = C A_3 \qquad \left(\Delta = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}\right)$$

where the notation $\frac{t}{\Delta} = C$ is introduced ($C$ is an arbitrary
constant). Similarly we derive $y = C B_3$ and $z = C C_3$. If we had dropped
the first equation instead of the third one we should have obtained
in the same way the general solution of the homogeneous system in the form
$x = C A_1$, $y = C B_1$ and $z = C C_1$.
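
The recipe can be tried on a concrete singular system. In the sketch below the three numbers returned are the second-order determinants (with the appropriate signs) built from the first two rows, that is, the quantities which multiply the arbitrary constant C when the third equation is discarded. The function name and the sample system are hypothetical.

```python
def third_row_cofactors(r1, r2):
    """Signed second-order determinants built from the rows (a1, b1, c1) and (a2, b2, c2)."""
    a1, b1, c1 = r1
    a2, b2, c2 = r2
    return (b1 * c2 - b2 * c1,      # multiplies C in the formula for x
            -(a1 * c2 - a2 * c1),   # multiplies C in the formula for y
            a1 * b2 - a2 * b1)      # multiplies C in the formula for z (this is Delta)

# Hypothetical homogeneous system whose third row is the sum of the first two,
# so that its determinant equals zero.
rows = [(1, 2, -1), (2, -1, 0), (3, 1, -1)]
A3, B3, C3 = third_row_cofactors(rows[0], rows[1])

# Every multiple of (A3, B3, C3) satisfies all three equations.
for C in (1, -2, 5):
    x, y, z = C * A3, C * B3, C * C3
    assert all(a * x + b * y + c * z == 0 for a, b, c in rows)
```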

It is sometimes necessary to investigate a system of linear
algebraic equations in which the number of equations and the number
of unknowns do not coincide. The general theory of such systems
is outlined in Sec. XI.5.

CHAPTER VII


Vectors 


§ 1. Linear Operations on Vectors 


1. Scalar and Vector Quantities. Scalar and vector quantities
differ in the following aspect: whilst the former are completely 
characterized by their numerical values relative to a chosen system 
of measurement units (such quantities as temperature, work, den- 
sity etc.), the latter have, in addition, certain direction in space 
(such quantities as force, velocity etc.). All the quantities we studied 
before were scalars and it was therefore permissible not to use the 
word “scalar”. But when we consider both scalar and vector quan- 
tities it is essentially important to take into account the nature 
of the quantities. A vector quantity, or, simply, a vector, can be 
represented by a line segment in space if we choose a certain unit 
of length. For instance, if we intend to represent forces we can assume 
that a segment of 2 cm represents a force of 1 kg and the like
(see Sec. I.4 on this question). The line segment representing a vector
is directed (oriented), that is, its origin and its terminus must be
indicated. The direction of a vector is usually indicated by an
arrow. Disregarding the direction of a vector we obtain the absolute
value (modulus) of this quantity. Thus, the absolute value of a
vector is a scalar which has the dimension of the quantity in question
and is always positive with the only exception for the zero vector 
(see Sec. 3). The absolute value of a vector is also called its length. 
For example, let us take a vector representing a force. When we 
speak about the vector we mean that its length has the dimension 
of force, that is the measurement of the length of the representing 
line segment in the chosen scale units yields the magnitude of the 
force. In mathematics vectors are usually regarded as dimensionless. 
The absolute value of such a vector is a dimensionless (abstract) 
number. Vectors are usually printed in bold face or designated by
arrows (see Fig. 139). The absolute value of a vector is denoted by
the same letter in ordinary print or in bold face between vertical
bars (the modulus sign): $|\overrightarrow{AB}| = a = |\mathbf{a}|$.

Thus, to define a vector means to define its absolute value and
its direction in space. Accordingly, every vector can be translated 
(that is it can be transferred in such a way that its direction in 
space should remain parallel to the original direction) to any place. 
Hence, the origin of a vector (“point of application”) can be any- 
where. In other words, two vectors are regarded as being equal if they 
have the same absolute values, are parallel and similarly directed 
(see Fig. 140). 

The freedom of translating a vector is sometimes restricted. For
instance, the origin of a certain vector can be fixed. Such vectors
are called localized or bound vectors (the radius-vector mentioned in
Sec. 9 is an example of such a vector). Further, there can
exist a certain straight line in which a vector must lie. Then the
vector is said to be a sliding vector. The vector of angular velocity
of a rotary motion, which lies on the axis of revolution, is an example
of such a vector. If the parallel translation of a vector is unrestricted
the vector is called a free vector.

Fig. 139. $\mathbf{a} = \overrightarrow{AB}$
Fig. 140. $\mathbf{a} = \mathbf{b}$

2. Addition of Vectors. Linear operations on vectors are the ope- 
rations of adding vectors together and of multiplying a vector by 
a scalar (and the operation of subtraction which is, of course, con- 
nected with the addition). Generally, a quantity can be considered 
to be a vector if and only if the above operations are performed in 
accordance with the rules described further in Secs. 2-4. 

The addition of two vectors is performed according to the well- 
known parallelogram law in mechanics which is the rule of adding 
forces and velocities. For example, if it is required to add together 
two vectors a and b they are applied to a common origin and then 
a parallelogram is constructed on them (see Fig. 141). A vector 
coinciding with the diagonal of the parallelogram whose origin 
is that of a and b is, by definition, the sum a + b. This construc- 
tion straightway implies that a +b=b + a, that is the commu- 
tative law holds for the addition of vectors. 

The opposite sides of a parallelogram being parallel and equal,
the vector $\overrightarrow{DC}$ in Fig. 141 is also equal to b. This implies one more
rule of adding vectors: the origin of the second vector is placed
at the terminus of the first vector. Then the vector closing the triangle,
that is, the vector whose origin is that of the first vector and whose
terminus is that of the second vector, is the sum of the vectors
(Fig. 142). If it is now necessary to add a third vector to the above
sum we must put the origin of the third vector at the terminus of
the second one and take the closing vector again, etc. The general
rule of adding together any number of vectors is illustrated in
Fig. 143. It follows from Fig. 144 that the associative law also holds
for a sum of vectors: a + (b + c) = (a + b) + c. The commutative
and associative laws imply that the order of summation and the
way of bracketing do not affect a sum of any number of summands.
For example,

$$(\mathbf{a} + \mathbf{b}) + (\mathbf{c} + \mathbf{d}) = [(\mathbf{b} + \mathbf{d}) + \mathbf{c}] + \mathbf{a} = [\mathbf{c} + (\mathbf{a} + \mathbf{d})] + \mathbf{b}$$

and the like.

Fig. 141
Fig. 142. $\mathbf{c} = \mathbf{a} + \mathbf{b}$


We must underline that we cannot add together vectors of different
dimensions and that it is impossible to add together a vector
and a scalar. Besides, we shall not consider the comparison of
vectors in our course. This means that there will be no positive and
negative vectors or inequalities of the form a > b etc. But of course
we can compare absolute values (lengths) of vectors. At the same
time we must not be surprised that the absolute value of a sum of
vectors may happen to be, for example, less than the absolute value
of each summand. Actually, vectors are added not as numbers
but according to the parallelogram law of forces, and the resultant
of a system of forces can be smaller than each of the forces.

Fig. 143
Fig. 144. $\mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c}$

We conclude by pointing out a consequence of Fig. 143, namely,

$$|\mathbf{a} + \mathbf{b} + \mathbf{c} + \mathbf{d}| \le |\mathbf{a}| + |\mathbf{b}| + |\mathbf{c}| + |\mathbf{d}|$$

Equality holds here only if all the vector summands have
the same direction; in such a case the polygon of vectors degenerates
into a straight line.

3. Zero Vector and Subtraction of Vectors. A vector whose
terminus coincides with its origin is called the zero vector or, simply,
zero. Its absolute value is equal to zero whereas all the other
vectors have positive absolute values. The direction of this vector is
undetermined, i.e. any direction may be ascribed to it. We can
therefore regard the zero vector as parallel (perpendicular) to any
vector. It is denoted as 0 and its role in the operation of adding
vectors is similar to that of the number zero in adding numbers.
In fact, it is apparent that a + 0 = a.

Fig. 145
Fig. 146
Fig. 147


Let a vector a = AB be given. Then the vector BA is called the 
negative of the vector a and is denoted as —a (see Fig. 145). Obvi- 
ously, a + (—a) = 0. 

To subtract a vector means to add its negative. It follows that 

b + (a — b) = b + [a + (—b)] = a + [b + (—b)] = 
a+0=a 


This corresponds to the usual definition of a difference. The geo- 
metrical interpretation of the rule of subtraction is shown in 
Fig. 146. 

4. Multiplying a Vector by a Scalar. The product $\lambda\mathbf{a} = \mathbf{a}\lambda$ of
a vector a by a dimensionless scalar (number) $\lambda$ is defined as follows:
if $\lambda > 0$ then the product is the vector obtained from a by
stretching it $\lambda$-fold without changing its direction; if $\lambda < 0$ then
a must be stretched $|\lambda|$-fold and its direction must be replaced
by the opposite direction (see Fig. 147). Further, $\frac{\mathbf{a}}{\lambda} = \left(\frac{1}{\lambda}\right)\mathbf{a}$. These
definitions imply the following simple properties:


1. $(-1)\mathbf{a} = -\mathbf{a}$;

2. $0\mathbf{a} = \mathbf{0}$;

3. $\lambda\mathbf{0} = \mathbf{0}$;

4. $(\lambda + \mu)\mathbf{a} = \lambda\mathbf{a} + \mu\mathbf{a}$;

5. $\lambda(\mathbf{a} + \mathbf{b}) = \lambda\mathbf{a} + \lambda\mathbf{b}$;

6. $\lambda(\mu\mathbf{a}) = (\lambda\mu)\mathbf{a}$;

7. $\frac{\lambda\mathbf{a}}{\lambda} = \mathbf{a}$;

8. If $n$ is a positive integer then $n\mathbf{a} = \underbrace{\mathbf{a} + \mathbf{a} + \ldots + \mathbf{a}}_{n\ \text{times}}$.

These properties enable us to perform linear operations on vectors
and transform algebraic expressions containing vectors in the same
way as is done with numbers. The properties are proved in an
obvious way. For instance, Fig. 148 demonstrates the proof of
property 5 [$\lambda\mathbf{a} + \lambda\mathbf{b} = \lambda\mathbf{c} = \lambda(\mathbf{a} + \mathbf{b})$] for $\lambda > 0$.

If a scalar $\lambda$ has a dimension the product $\lambda\mathbf{a}$ is defined as a vector
whose absolute value is equal to $|\lambda||\mathbf{a}|$ and which is parallel
to a, directed like a if $\lambda > 0$ and opposite to a if $\lambda < 0$.

Fig. 148. $\mathbf{c} = \mathbf{a} + \mathbf{b}$, $\lambda\mathbf{c} = \lambda\mathbf{a} + \lambda\mathbf{b}$
Fig. 149. $\mathbf{c} = \lambda\mathbf{a} + \mu\mathbf{b}$

All the enumerated properties remain true for this case as well.
Further, for the sake of simplicity, we shall regard all vectors and
scalars as dimensionless unless otherwise stated.

5. Linear Combination of Vectors. Suppose we have several
vectors, for example, three vectors a, b and c. Then every vector of the
form $\mathbf{d} = \lambda\mathbf{a} + \mu\mathbf{b} + \nu\mathbf{c}$, where $\lambda$, $\mu$ and $\nu$ are scalars, is called
a linear combination of the vectors a, b and c. The vector d is also
said to be expressed linearly in terms of a, b and c, which means that
d can be obtained by linear operations on a, b and c. As examples
of such linear combinations we can take

$$\mathbf{a} + 2\mathbf{b} - 3\mathbf{c}, \qquad \mathbf{0} = 0\cdot\mathbf{a} + 0\cdot\mathbf{b} + 0\cdot\mathbf{c}, \quad \text{etc.}$$

A given system of vectors is called linearly dependent if one of the
vectors is expressed linearly in terms of the rest. Otherwise the
vectors are called linearly independent.

Two vectors are linearly dependent if and only if they are parallel
to each other. Indeed, it follows from the definition (see Sec. 4)
that $\mathbf{b} = \lambda\mathbf{a}$ implies $\mathbf{b} \parallel \mathbf{a}$. Conversely, if two vectors are parallel
then they are linearly dependent because we can always take such
a coefficient of extension $\lambda$ that after increasing the length of one
of the vectors $\lambda$ times the extended vector coincides with the
other vector (in case the parallel vectors are of opposite directions
the coefficient $\lambda$ is negative).

Three vectors are linearly dependent if and only if they are parallel
to a common plane. Indeed, let $\mathbf{c} = \lambda\mathbf{a} + \mu\mathbf{b}$. Let us translate the
three vectors so that their origins coincide. Draw a plane through
the vectors a and b (the plane P in Fig. 149). Then the vectors $\lambda\mathbf{a}$
and $\mu\mathbf{b}$ will lie in the plane P and therefore their sum, that is c,
will also lie in the same plane. Consequently, the vectors a, b and c
were parallel to the plane P in their original positions. Conversely,
let it be known that the vectors a, b and c are parallel to a common
plane P. Fig. 149 shows the representation of c in the form of a linear
combination of a and b in case $\mathbf{a} \nparallel \mathbf{b}$. Such a representation is called
the resolution of a vector in a plane into components with respect to two
given non-parallel vectors. As for the case $\mathbf{a} \parallel \mathbf{b}$, the preceding
paragraph implies that one of the two vectors a and b (for instance, the
vector a) is expressed linearly in terms of the other vector (for
instance, in terms of b). Therefore a, b and c are linearly dependent
because a is expressed in terms of b and c.

Four or more vectors are always linearly dependent. Indeed, let
us take four vectors a, b, c and d. Translate them to a common
origin. If after this the vectors a, b and c lie in a common plane then,
in accord with the preceding paragraph, one of them is expressed
linearly in terms of the rest, etc. (as at the end of the preceding
paragraph). Now let a, b and c not lie in a common plane after they
are translated to a common origin (see Fig. 150). Then we can draw
a straight line parallel to the vector c and passing through the
point D (the terminus of the vector d), the point C being the point
of intersection of the straight line with the plane in which the
vectors a and b lie. Finally, we draw a straight line parallel to the
vector b and passing through the point C. This straight line intersects
the straight line containing the vector a at a point B. Then we can
write

$$\mathbf{d} = \overrightarrow{AD} = \overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CD} = \lambda\mathbf{a} + \mu\mathbf{b} + \nu\mathbf{c} \tag{1}$$

Such a representation is called the resolution of a vector into
components with respect to three given vectors which are not parallel to
one and the same plane. We also call it the resolution of a vector into
components along three given axes (these axes are denoted as ll, mm
and nn in Fig. 150). Representations of this type are utilized in
theoretical mechanics and other branches of science for resolving
forces and other vector quantities into components along three
given directions. Each of the summands $\lambda\mathbf{a}$, $\mu\mathbf{b}$ and $\nu\mathbf{c}$ is called a
component of the vector d along the corresponding direction. A
component along an axis depends not only on the direction of the axis
but also on the directions of the other axes. At the same time a
component does not depend on the choice of positive directions for the
given axes.

Fig. 150. $\mathbf{d} = \lambda\mathbf{a} + \mu\mathbf{b} + \nu\mathbf{c}$
Fig. 151. $\mathbf{a} = \mathbf{a}_P + \mathbf{a}_l$

Another type of representation of a vector which is sometimes used
is shown in Fig. 151. Here a given vector is resolved into
components $\mathbf{a}_P$ and $\mathbf{a}_l$ where $\mathbf{a}_P$ lies in a given plane and $\mathbf{a}_l$ is parallel
to a given axis.

Resolution (1) can be performed uniquely. Indeed, if another
representation $\mathbf{d} = \lambda_1\mathbf{a} + \mu_1\mathbf{b} + \nu_1\mathbf{c}$ existed we should equate their
right-hand sides and deduce the relation
$(\lambda - \lambda_1)\mathbf{a} + (\mu - \mu_1)\mathbf{b} + (\nu - \nu_1)\mathbf{c} = \mathbf{0}$.
This would imply that the vectors a, b and c
are linearly dependent (why is it so?).

A system of linearly independent vectors used for resolving other 
vectors into components with respect to the system is called a basis. 
The foregoing discussion implies that any two non-parallel vectors 
can be taken as a basis in their plane and that any three vectors 
non-parallel to one and the same plane can be taken as a basis in 
space. If a, b and c form such a basis then the numbers $\lambda$, $\mu$ and $\nu$
entering into representation (1) are called the coordinates of the
vector d with respect to the basis a, b, c. If a basis a, b, c is given
the coordinates $\lambda$, $\mu$ and $\nu$ are uniquely determined by the vector d.
Conversely, the vector d is uniquely determined by its coordinates
$\lambda$, $\mu$ and $\nu$.
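
For vectors specified numerically, the coordinates with respect to a basis can be computed by solving a system of three linear equations. The sketch below is an illustration under stated assumptions: the vectors are given by triples of integer Cartesian components (a description introduced only later in the chapter), the system is solved by determinants, and the names `det3` and `coordinates` are hypothetical.

```python
from fractions import Fraction

def det3(u, v, w):
    """Third-order determinant whose columns are the component triples u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - v[0] * (u[1] * w[2] - u[2] * w[1])
            + w[0] * (u[1] * v[2] - u[2] * v[1]))

def coordinates(d, a, b, c):
    """Coordinates (lam, mu, nu) of d in the basis a, b, c, found by determinants."""
    D = det3(a, b, c)
    if D == 0:
        raise ValueError("a, b, c are parallel to a common plane: not a basis")
    return (Fraction(det3(d, b, c), D),
            Fraction(det3(a, d, c), D),
            Fraction(det3(a, b, d), D))

# Hypothetical basis and vector.
a, b, c = (1, 0, 0), (1, 1, 0), (1, 1, 1)
lam, mu, nu = coordinates((2, 3, 4), a, b, c)

# Check the resolution d = lam*a + mu*b + nu*c component by component.
rebuilt = tuple(lam * p + mu * q + nu * r for p, q, r in zip(a, b, c))
```

A zero determinant signals exactly the case excluded above: three vectors parallel to a common plane cannot serve as a basis in space.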


§ 2. Scalar Product of Vectors 
6. Projection of Vector on Axis. Let a vector $\mathbf{a} = \overrightarrow{AB}$ and an
axis $l$ be given (see Fig. 152). The projection $\operatorname{proj}_l \mathbf{a}$ of the vector a
on the axis $l$ is the length of the segment A'B' connecting the feet
of the perpendiculars drawn from the points A and B to the axis $l$.
This length is taken with the sign + or − depending on whether
the direction of the segment A'B' coincides with the positive
direction of the axis or is opposite to it. A projection of a vector on
another vector is defined similarly. In this case the perpendiculars
are drawn to the other vector or to its prolongation. Thus, a
projection of a vector is a scalar; the dimension of a projection coincides
with that of the projected vector.

Fig. 152. Note that the drawing is spatial!




The basic properties of projections are the following. 

1. The sign + or — indicates that while moving from the origin 
of the vector to its terminus we go, respectively, forwards or back- 
wards with respect to the positive direction of the axis. A projection 
equals zero (i.e. A’ coincides with B’) if and only if the vector is 
perpendicular to the axis (see Fig. 153). 


Fig. 153. $\operatorname{proj}_l \mathbf{a} > 0$, $\operatorname{proj}_l \mathbf{b} < 0$, $\operatorname{proj}_l \mathbf{c} = 0$

2. Parallel translation does not change the projection of a vector.

3. Considering the triangle TMN in Fig. 152 we conclude
that

$$\operatorname{proj}_l \mathbf{a} = TM\cos\alpha = a\cos(\widehat{\mathbf{a}, l}) \tag{2}$$

[the symbol $(\widehat{\mathbf{a}, l})$ denotes the angle between the vector and the
axis]. According to this formula, the sign of the projection is
determined by the sign of the cosine. Therefore if the angle is acute its
cosine is positive and the projection is also positive; if the angle
is obtuse its cosine is negative and the projection is also negative.
The case when the angle in question is acute is depicted in Fig. 152.

4. A scalar factor may be taken outside the projection sign:
$\operatorname{proj}_l(\lambda\mathbf{a}) = \lambda\operatorname{proj}_l \mathbf{a}$. Indeed, if the length of a vector is
increased (that is, the vector is stretched) or decreased several times its
projection will change in just the same way.

5. A projection of a sum is equal to the sum of the projections:
$\operatorname{proj}_l(\mathbf{a} + \mathbf{b}) = \operatorname{proj}_l \mathbf{a} + \operatorname{proj}_l \mathbf{b}$ (see Fig. 154).

7. Scalar Product. The scalar product of two vectors a and b is
defined as the product of the absolute values of the vectors by the cosine
of the angle between them. It is designated by the multiplication
sign (·) or by parentheses:

$$\mathbf{a}\cdot\mathbf{b} = (\mathbf{a}, \mathbf{b}) = ab\cos(\widehat{\mathbf{a}, \mathbf{b}}) \tag{3}$$

Thus, the scalar product of two vectors is a scalar. It should be
noted that the notion of a scalar product computed according to
formula (3) cannot be extended to more than two vectors. Bearing
in mind formula (2) we can also write the relation

$$\mathbf{a}\cdot\mathbf{b} = b\operatorname{proj}_{\mathbf{b}} \mathbf{a} = a\operatorname{proj}_{\mathbf{a}} \mathbf{b} \tag{4}$$

Therefore, the scalar product of two vectors is equal to the product
of the absolute value of one of the vectors by the projection of the other
vector on the first.

Example. Let s be the displacement vector of a material point and
let F be a constant force (one of the forces acting upon the point
in the process of motion) depicted in Fig. 155. To reckon the work A
performed by this force we must take into account only the
component F' of the force F along the direction of the displacement.
Hence, $A = s\operatorname{proj}_{\mathbf{s}} \mathbf{F} = \mathbf{s}\cdot\mathbf{F}$.

The dimension of a scalar product is equal to the product of the
dimensions of the factors.

Fig. 154


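Formulas (3) and (4) are simplified if one of the factors (or both
factors) is a unit vector, that is, a vector with the dimensionless
absolute value 1. For example, if $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}$ are unit vectors,

$$\mathbf{e}_1\cdot\mathbf{e}_2 = \cos(\widehat{\mathbf{e}_1, \mathbf{e}_2}), \qquad \mathbf{a}\cdot\mathbf{e} = \operatorname{proj}_{\mathbf{e}} \mathbf{a} \tag{5}$$

A unit vector lying on an axis $l$ is usually denoted as $\mathbf{l}^0$. Similarly,
a unit vector parallel to a vector b and having the same direction
is denoted by $\mathbf{b}^0$. Taking into account formula (5) we can write

$$\operatorname{proj}_l \mathbf{a} = \operatorname{proj}_{\mathbf{l}^0} \mathbf{a} = \mathbf{a}\cdot\mathbf{l}^0, \qquad \operatorname{proj}_{\mathbf{b}} \mathbf{a} = \mathbf{a}\cdot\mathbf{b}^0$$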


8. Properties of Scalar Product.

1. A scalar product is equal to zero if and only if the vectors are
orthogonal, that is, perpendicular to each other:

$$\mathbf{a}\cdot\mathbf{b} = 0 \ \text{ is equivalent to } \ \mathbf{a} \perp \mathbf{b}$$

Indeed, this follows from (3) since $\cos(\widehat{\mathbf{a}, \mathbf{b}}) = 0$ implies
$(\widehat{\mathbf{a}, \mathbf{b}}) = 90°$. Of course, it may happen that $a = 0$ but this means
that $\mathbf{a} = \mathbf{0}$, and a zero vector can be regarded as being perpendicular
to any vector (see Sec. 3).


2. $\mathbf{a}\cdot\mathbf{a} = a^2$ because $(\widehat{\mathbf{a}, \mathbf{a}}) = 0°$ and $\cos(\widehat{\mathbf{a}, \mathbf{a}}) = 1$. In other
words, the scalar square of a vector equals the square of its
absolute value.

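3. A scalar product does not depend on the order of its factors:
$\mathbf{a}\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{a}$. [This immediately follows from formula (3).]

4. A scalar factor may be taken outside the scalar product of
vectors:

$$(\lambda\mathbf{a})\cdot\mathbf{b} = \mathbf{a}\cdot(\lambda\mathbf{b}) = \lambda(\mathbf{a}\cdot\mathbf{b}) \tag{6}$$

Indeed, on the basis of property 4 from Sec. 6, we have
$\operatorname{proj}_{\mathbf{b}}(\lambda\mathbf{a}) = \lambda\operatorname{proj}_{\mathbf{b}} \mathbf{a}$. Multiplying both sides by $b$ and taking into account
formula (4) we derive $\mathbf{b}\cdot(\lambda\mathbf{a}) = \lambda(\mathbf{a}\cdot\mathbf{b})$. The scalar product being
commutative, the last relation implies formula (6).

[We suggest that the reader should deduce formula (6) as a direct
consequence of the definition of a scalar product.]

5. Distributive law:

$$(\mathbf{a} + \mathbf{b})\cdot\mathbf{c} = \mathbf{a}\cdot\mathbf{c} + \mathbf{b}\cdot\mathbf{c}$$

To prove this property we write $\operatorname{proj}_{\mathbf{c}}(\mathbf{a} + \mathbf{b}) = \operatorname{proj}_{\mathbf{c}} \mathbf{a} + \operatorname{proj}_{\mathbf{c}} \mathbf{b}$
on the basis of property 5 in Sec. 6. Then we multiply both sides
of the last relation by $c$, which yields the desired formula.

The enumerated properties enable us to compute the scalar
product of linear combinations of vectors. For instance,

$$(\mathbf{a} + 2\mathbf{b})\cdot(2\mathbf{a} - 3\mathbf{b}) = 2\mathbf{a}\cdot\mathbf{a} - 3\mathbf{a}\cdot\mathbf{b} + 4\mathbf{b}\cdot\mathbf{a} - 6\mathbf{b}\cdot\mathbf{b} = 2a^2 + \mathbf{a}\cdot\mathbf{b} - 6b^2$$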


§ 3. Cartesian Coordinates in Space 


9. Cartesian Coordinates in Space. Let us take a triad of vectors
a, b and c drawn from a common origin at a point O and not lying
in one and the same plane. We choose the point O as the origin
of coordinates. The origin of a coordinate system given, the
position of an arbitrary (variable) point M in space is completely
specified by the vector $\mathbf{r} = \overrightarrow{OM}$. This vector is called the radius-vector
of the point M (see Fig. 156). As proved in Sec. 5, we can take the
vectors a, b and c as a basis and represent the radius-vector in the
form $\mathbf{r} = \lambda\mathbf{a} + \mu\mathbf{b} + \nu\mathbf{c}$. Thus, the position of the point M is
characterized by the triple of numbers $\lambda$, $\mu$ and $\nu$ which are called the
affine coordinates of the point M. In accordance with the end of
Sec. 5 we can say that the affine coordinates of a point are the
coordinates of its radius-vector. Therefore, every point in space, just
as a point in a plane, has certain coordinates and, conversely, the
coordinates being given, we can always construct the corresponding
point (but in space a point has three coordinates).

If the vectors chosen as a basis of a coordinate system have unit
lengths and are mutually perpendicular the coordinate system is
called a Cartesian system. In such a case the vectors forming a Cartesian
basis are usually designated by the letters i, j and k. Cartesian
coordinates are usually denoted as $x$, $y$ and $z$. Thus, according to Fig. 157,
we have

$$\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k} \tag{7}$$

Fig. 156
Fig. 157

In Fig. 157 we see the point M with the coordinates z = ae 
y = 2.4 and z = 1.6. The planes xOy, yOz and xOz (the coordinate 
planes) divide the space into eight parts (octants). The signs of the 
coordinates of a point show in which octant the point lies. 

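Similarly, any vector a can be represented (with respect to a
coordinate system) in the form analogous to (7):

$$\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k} \tag{8}$$

where $a_x$, $a_y$ and $a_z$ are the projections of the vector a on the
corresponding coordinate axes. Taking into account that for any unit
vector e we have, by formula (2), $\operatorname{proj}_x \mathbf{e} = \cos(\widehat{\mathbf{e}, x})$, we deduce
from formula (8) the relation

$$\mathbf{e} = \cos(\widehat{\mathbf{e}, x})\,\mathbf{i} + \cos(\widehat{\mathbf{e}, y})\,\mathbf{j} + \cos(\widehat{\mathbf{e}, z})\,\mathbf{k} \tag{9}$$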


10. Some Simple Problems Concerning Cartesian Coordinates.

1. Operations on vectors represented in terms of their projections
on the axes of a Cartesian coordinate system are performed according
to the following simple rules. If [see formula (8)] we have

$$\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k} \quad\text{and}\quad \mathbf{b} = b_x\mathbf{i} + b_y\mathbf{j} + b_z\mathbf{k}$$

then (see Secs. 2, 4 and 8)

$$\mathbf{a} + \mathbf{b} = (a_x + b_x)\mathbf{i} + (a_y + b_y)\mathbf{j} + (a_z + b_z)\mathbf{k} \tag{10}$$

$$\lambda\mathbf{a} = \lambda a_x\mathbf{i} + \lambda a_y\mathbf{j} + \lambda a_z\mathbf{k} \tag{11}$$

$$\mathbf{a}\cdot\mathbf{b} = a_x b_x + a_y b_y + a_z b_z \tag{12}$$

$$a^2 = \mathbf{a}\cdot\mathbf{a} = a_x^2 + a_y^2 + a_z^2 \tag{13}$$

To deduce the last two formulas, which are of great importance, we
should notice that $\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1$ and $\mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{i} = 0$
since i, j and k are mutually perpendicular unit vectors. Formula
(13) is nothing but the expression of Pythagoras’ theorem in space:
the square of the length of the diagonal of a rectangular
parallelepiped equals the sum of the squares of the lengths of its sides.

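For example, let it be required to determine the angle between
the vectors $\mathbf{a} = 3\mathbf{i} - 2\mathbf{j} + \mathbf{k}$ and $\mathbf{b} = -2\mathbf{i} + \mathbf{j} + 4\mathbf{k}$. We have,
by formula (3),

$$\cos(\widehat{\mathbf{a}, \mathbf{b}}) = \frac{\mathbf{a}\cdot\mathbf{b}}{ab} = \frac{3\cdot(-2) + (-2)\cdot 1 + 1\cdot 4}{\sqrt{3^2 + (-2)^2 + 1^2}\,\sqrt{(-2)^2 + 1^2 + 4^2}} = \frac{-4}{\sqrt{14\cdot 21}} \approx -0.233$$

from which we obtain, within the accuracy guaranteed by a slide
rule, $(\widehat{\mathbf{a}, \mathbf{b}}) = 103.5°$.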
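
The same computation can be repeated numerically. The sketch below represents vectors by tuples of their Cartesian projections; the helper names `dot`, `length` and `angle_deg` are hypothetical.

```python
import math

def dot(a, b):
    """Scalar product in terms of the projections on the coordinate axes."""
    return sum(p * q for p, q in zip(a, b))

def length(a):
    """Absolute value of a vector as the square root of its scalar square."""
    return math.sqrt(dot(a, a))

def angle_deg(a, b):
    """Angle between two nonzero vectors, in degrees."""
    return math.degrees(math.acos(dot(a, b) / (length(a) * length(b))))

a, b = (3, -2, 1), (-2, 1, 4)
print(dot(a, b), round(angle_deg(a, b), 1))
```

The scalar product comes out negative, which by itself already shows that the angle is obtuse.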

2. The parallelism and perpendicularity conditions for vectors
given in the form of representation (8) can be derived as follows.
The condition $\mathbf{a} \parallel \mathbf{b}$ is, by Sec. 5, equivalent to a relation of the
form $\mathbf{b} = \lambda\mathbf{a}$ or, on the basis of formula (11), $b_x = \lambda a_x$, $b_y = \lambda a_y$,
$b_z = \lambda a_z$. Now, eliminating $\lambda$, we get the desired condition

$$\frac{b_x}{a_x} = \frac{b_y}{a_y} = \frac{b_z}{a_z} \quad (\text{which is the condition of } \mathbf{a} \parallel \mathbf{b})$$

Further, according to Sec. 8 (property 1), the condition $\mathbf{a} \perp \mathbf{b}$
is equivalent to the equality $\mathbf{a}\cdot\mathbf{b} = 0$ and, by formula (12), we
deduce the condition

$$a_x b_x + a_y b_y + a_z b_z = 0 \quad (\text{which is the condition of } \mathbf{a} \perp \mathbf{b})$$


3. The cosines of the angles which a vector forms with the coordinate
axes are called the direction cosines of the vector. If a vector a is
given in the form of its representation (8) then we have, by formula
(2), $a_x = \operatorname{proj}_x \mathbf{a} = a\cos(\widehat{\mathbf{a}, x})$ etc., that is

$$\cos(\widehat{\mathbf{a}, x}) = \frac{a_x}{a}, \qquad \cos(\widehat{\mathbf{a}, y}) = \frac{a_y}{a}, \qquad \cos(\widehat{\mathbf{a}, z}) = \frac{a_z}{a}$$

This implies

$$\cos^2(\widehat{\mathbf{a}, x}) + \cos^2(\widehat{\mathbf{a}, y}) + \cos^2(\widehat{\mathbf{a}, z}) = \frac{a_x^2 + a_y^2 + a_z^2}{a^2} = 1$$

The direction cosines of a vector completely determine its
direction but give no information about its length.

The direction cosines of an axis are defined in like manner: they 
are the direction cosines of an arbitrarily taken vector parallel 
to the axis and having the same direction. 
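As an illustration, the direction cosines can be computed directly from the projections. This is a sketch under the assumption that a nonzero vector is given as a tuple $(a_x, a_y, a_z)$; the function name `direction_cosines` is hypothetical.

```python
import math

def direction_cosines(a):
    """Return (a_x / a, a_y / a, a_z / a) for a nonzero vector a."""
    length = math.sqrt(sum(p * p for p in a))
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(p / length for p in a)

cosines = direction_cosines((3, -2, 1))
# The squares of the direction cosines add up to 1.
total = sum(c * c for c in cosines)
# A vector of the same direction but twice the length has the same cosines.
doubled = direction_cosines((6, -4, 2))
```

The second call shows that the length of the vector is indeed lost: only the direction survives.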


Fig. 158 (points $M_1$ and $M_2$)

Fig. 159 (points $M_1$, $M$ and $M_2$)


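4. The vector connecting two given points $M_1(x_1, y_1, z_1)$ and
$M_2(x_2, y_2, z_2)$ can be found as follows. According to Fig. 158 we
have $\overrightarrow{OM_1} = \mathbf{r}_1 = x_1\mathbf{i} + y_1\mathbf{j} + z_1\mathbf{k}$ and
$\overrightarrow{OM_2} = \mathbf{r}_2 = x_2\mathbf{i} + y_2\mathbf{j} + z_2\mathbf{k}$, which implies (see Sec. 3):

$$\overrightarrow{M_1M_2} = \mathbf{r}_2 - \mathbf{r}_1 = (x_2 - x_1)\mathbf{i} + (y_2 - y_1)\mathbf{j} + (z_2 - z_1)\mathbf{k}$$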


5. The distance between two points $M_1(x_1, y_1, z_1)$ and
$M_2(x_2, y_2, z_2)$ is equal to

$$M_1M_2 = \sqrt{(\overrightarrow{M_1M_2})^2} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \qquad (14)$$

which follows from the preceding argument. This formula is very
much like the corresponding formula for a plane [see formula (II.1)].
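6. Dividing a segment in a given ratio. Suppose we have
$M_1(x_1, y_1, z_1)$, $M_2(x_2, y_2, z_2)$ and $\dfrac{M_1M}{MM_2} = \lambda$
(see Fig. 159). It is required to find the point M (x, y, z). We have

$$\mathbf{r}_1 = x_1\mathbf{i} + y_1\mathbf{j} + z_1\mathbf{k}; \quad \mathbf{r}_2 = x_2\mathbf{i} + y_2\mathbf{j} + z_2\mathbf{k} \qquad (15)$$

The vectors $\overrightarrow{M_1M}$ and $\overrightarrow{MM_2}$ being parallel, we see that
$\overrightarrow{M_1M} = \lambda\,\overrightarrow{MM_2}$. But $\overrightarrow{M_1M} = \mathbf{r} - \mathbf{r}_1$ and $\overrightarrow{MM_2} = \mathbf{r}_2 - \mathbf{r}$, that is
$\mathbf{r} - \mathbf{r}_1 = \lambda(\mathbf{r}_2 - \mathbf{r})$, $\mathbf{r} - \mathbf{r}_1 = \lambda\mathbf{r}_2 - \lambda\mathbf{r}$ and $\mathbf{r} + \lambda\mathbf{r} = \mathbf{r}_1 + \lambda\mathbf{r}_2$. Hence,

$$\mathbf{r} = \frac{\mathbf{r}_1 + \lambda\mathbf{r}_2}{1 + \lambda} \qquad (16)$$

Equating the projections of both sides on the axes x, y and z
[see formulas (7) and (15)] we finally deduce

$$x = \frac{x_1 + \lambda x_2}{1 + \lambda}, \quad y = \frac{y_1 + \lambda y_2}{1 + \lambda}, \quad z = \frac{z_1 + \lambda z_2}{1 + \lambda} \qquad (17)$$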
Fig. 160

When passing from formula (16), which represents the solution
of the problem in a vectorial form, to formula (17), we have projected
the vectorial formula on the coordinate axes. Generally, it is clear
that every vectorial equality of the form a = b considered for
vectors in space is equivalent to the three scalar equalities

$$a_x = b_x, \quad a_y = b_y, \quad a_z = b_z$$

which are obtained as a result of projecting the former equality
on the coordinate axes.
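Formula (17) translates into a few lines of code. The sketch below assumes points are tuples of coordinates; the name `divide_segment` is our own, and the case λ = −1 (for which no such point exists) is rejected explicitly.

```python
def divide_segment(m1, m2, lam):
    """Point M dividing the segment M1 M2 so that M1M / MM2 = lam, formula (17)."""
    if lam == -1:
        raise ValueError("lam = -1 gives a zero denominator in formula (17)")
    # Each coordinate is obtained by projecting formula (16) on an axis.
    return tuple((p + lam * q) / (1 + lam) for p, q in zip(m1, m2))

midpoint = divide_segment((1, 2, 3), (3, 4, 5), 1)      # lam = 1 gives the midpoint
two_thirds = divide_segment((0, 0, 0), (3, 6, 9), 2)    # M1M is twice as long as MM2
```

Note that the same function works for points of a plane, since it simply projects on however many axes the tuples contain.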

7. Translation of coordinate axes. Let the coordinate axes x', y'
and z' be obtained by means of a translation of the axes x, y and z,
the displacement vector being a (see Fig. 160). Then we have the
relation r = a + r' for the radius-vectors of any point M. Projecting
this relation on the coordinate axes we obtain the formula

$$x = x' + a_x, \quad y = y' + a_y, \quad z = z' + a_z,$$


which describes the connection between the new coordinates and
the old ones (see problem 3.1 in Sec. II.2).

§ 4. Vector Product of Vectors


11. Orientation of Surface and Vector of an Area. A surface in
space is called oriented if there is an indication as to which of its
sides is regarded as outer and which as inner. As a rule, such an
orientation can be performed in two ways (see Fig. 161). Even when
we have a closed surface (e.g. a sphere) it is sometimes convenient
to introduce an “unnatural” orientation, that is to regard the outer
(in an ordinary, “everyday” sense) side of the surface as inner and
vice versa.

Fig. 161 (inner side and outer side of a surface)

An orientation of a non-closed surface can also be determined
by pointing out the direction of describing its contour. Thus, we
have two methods of indicating an orientation of a non-closed
surface. To establish the connection between them it is necessary to
point out, in addition, whether we take the right-hand screw rule
or the left-hand screw rule which are illustrated in Fig. 162. For
instance, the right-hand screw rule can be stated in the following
way: if we take a right-handed screw (which is usually used in
engineering and everyday life) and rotate it in the direction of describing
the contour it must move from the inner side of the surface to the
outer side. Or, in other words, if we imagine that a man walks on
the outer side of the surface along its contour in the positive
direction of describing the contour he must see “the precipice” on the
right, and the surface itself should remain on the left.

Fig. 162 (right-handed screw and left-handed screw; outer side and direction of describing the contour)

When we consider an oriented part of a plane it sometimes turns
out that only its area and its orientation in space are important
whereas the specific form of the part (that is whether it is a circle
or a rectangle etc.) does not matter. In such circumstances we can
represent this part (S) of the plane by means of a vector which is
perpendicular to (S) and is directed from its inner side to the
outer side (see Fig. 163) whereas its absolute value is taken
equal to the area of (S). We shall call such a vector the vector
of the area (S) and denote it as S. Obviously, this vector
completely defines the area and the orientation in space of such a
surface element.

Fig. 163 (contour and direction of describing it)    Fig. 164
Let us discuss an example of using the vector of an area. Suppose 
we have a homogeneous gas flow, that is a flow in which the velo- 
city v of the particles is the same at all points. Now imagine that 
we put an oriented plane surface element (S) into the flow, and 
it is required to determine the volume of the gas which passes 
through (S) during the unit interval of time from its inner 
side to the outer side. Since the volume of the gas passing 
in unit time fills a cylinder with base (S) and altitude 
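$|\mathbf{v}|\cos(\widehat{\mathbf{S},\mathbf{v}})$ (think why it is so), the sought-for volume is equal to
$|\mathbf{S}|\,|\mathbf{v}|\cos(\widehat{\mathbf{S},\mathbf{v}}) = \mathbf{S}\cdot\mathbf{v}$.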
12. Vector Product. The vector product of two given vectors a
and b is defined as the vector of the area of the parallelogram con- 
structed on the vectors a and b (when the vectors are translated to 
a common origin) and oriented so that we should begin with the 
first vector (i.e. with a) while describing the contour of the paralle- 
logram in the positive direction. The definition is illustrated in 
Fig. 164. We shall use the right-hand screw rule throughout our 
course unless the contrary is stated. We have also used the
right-hand screw rule in Fig. 164.

Now let us introduce the notion of a directed triad of vectors
which is important for our further purposes. Let three vectors a, b
and c with a common origin be given. We shall regard the vectors
as taken in a certain order, that is a is the first vector, b the second
and c the third one. Let us also suppose that the vectors do not
lie in the same plane. Such a triad is said to be right-handed if the
shortest rotation from the vector a to the vector b is seen to be in
the counterclockwise direction when we contemplate the rotation
from the terminus of the vector c. If the rotation is seen to be in
the clockwise direction the triad is called left-handed. Right- and
left-handed triads are shown in Fig. 165. The origin of this
terminology is illustrated in Fig. 166. Note that if we interchange the
numbers of two vectors retaining the third one at its original place
then the orientation of the triad will change. For instance, if a
triad a, b, c is right-handed then the triad a, c, b is left-handed
(check it!). The orientation of a triad does not change when we
perform the so-called circular permutation, that is when the second
vector is put in the first place, the third vector in the second place
and the first vector in the third place (or the other way round). For example,
if a triad a, b, c is right-handed then the triad b, c, a is also
right-handed.

Fig. 165 (right-handed triad; left-handed triad)

Fig. 166

The orientation of a Cartesian triad i, j, k must always correspond 
to the screw rule that we choose. Thus, i, j, k must form a right- 
handed triad in case the right-hand screw rule is chosen and they 
must form a left-handed triad if otherwise. In accordance with the
rule we distinguish between the so-called right-handed and
left-handed Cartesian coordinate systems. For instance, the Cartesian
system in Fig. 157 is a right-handed one. Now we can formulate one
more definition of a vector product which, as it can be seen in
Fig. 164, is equivalent to the former definition.

The vector product of two vectors a and b is a vector c which is
perpendicular to both vectors, has an absolute value
equal to the area of the parallelogram constructed on the vectors a
and b and forms a triad with the vectors a and b (the triad a, b, c)
directed as the triad i, j, k (that is the triads a, b, c and i, j, k have
the same orientation). Thus, the triad a, b, c is right-handed or
left-handed in accordance with the Cartesian triad i, j, k. The vector
product of the vectors a and b is denoted as a × b or [a, b].

13. Properties of Vector Product. 

1. The absolute value of the vector product a × b is equal to
$|\mathbf{a}\times\mathbf{b}| = ab\sin(\widehat{\mathbf{a},\mathbf{b}})$. Indeed, this expression corresponds to
the formula of calculating the area of a parallelogram.

The last formula together with formula (3) and property 2 in
Sec. 8 implies the following consequence:

$$(\mathbf{a}\times\mathbf{b})^2 + (\mathbf{a}\cdot\mathbf{b})^2 = |\mathbf{a}\times\mathbf{b}|^2 + (\mathbf{a}\cdot\mathbf{b})^2 = a^2b^2\sin^2(\widehat{\mathbf{a},\mathbf{b}}) + a^2b^2\cos^2(\widehat{\mathbf{a},\mathbf{b}}) = a^2b^2$$

It should be understood that the first summand on the left-hand
side is the scalar square of a vector (i.e. of a × b) whereas the second
one is the square of a scalar [of the scalar (a·b)].
2. A vector product is equal to zero if and only if the vectors are
parallel:

a × b = 0 is equivalent to a || b

Indeed, if the vectors are parallel then the corresponding
parallelogram degenerates into a line segment having the zero area and
vice versa. In particular, we always have a × a = 0.

3. The vector product is anticommutative:

$$\mathbf{b}\times\mathbf{a} = -(\mathbf{a}\times\mathbf{b})$$

Indeed, changing the order of the factors does not affect the form
of the parallelogram but its contour will be described in the opposite
direction and the vector of the surface element will therefore be
replaced by the opposite one.
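4. A scalar factor can be taken outside the vector product:

$$(\lambda\mathbf{a})\times\mathbf{b} = \mathbf{a}\times(\lambda\mathbf{b}) = \lambda(\mathbf{a}\times\mathbf{b})$$

since increasing the length of one of the sides of a parallelogram
λ times results in increasing its area λ times. (If λ < 0 then the
directions of the corresponding vectors are replaced by the opposite
directions but the above rule remains true.)

5. Distributive law:

$$(\mathbf{a} + \mathbf{b})\times\mathbf{c} = \mathbf{a}\times\mathbf{c} + \mathbf{b}\times\mathbf{c}, \qquad \mathbf{c}\times(\mathbf{a} + \mathbf{b}) = \mathbf{c}\times\mathbf{a} + \mathbf{c}\times\mathbf{b} \qquad (18)$$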


To prove the law let us translate the vectors a, b and c to a
common origin O and draw a plane (P) ⊥ c through O (see Fig. 167).
Now we construct a parallelogram on the vectors a and b, its
diagonal being d. Let us project the parallelogram on the plane (P).
Next we turn the projected parallelogram about the axis c through
90° and simultaneously extend it c-fold (the whole construction
is shown in Fig. 167). It is clear that

$$\mathbf{d} = \mathbf{a} + \mathbf{b}, \qquad \mathbf{d}' = \mathbf{a}' + \mathbf{b}' \qquad (19)$$

where a' and b' are the sides of the third parallelogram and d' is
its diagonal.

Fig. 167. $OK_1S_1R_1$ is the projection of the parallelogram OKSR; OK'S'R' is obtained by turning and
stretching the parallelogram $OK_1S_1R_1$

For the sake of convenience the operations on the vector a
performed in the above construction are depicted separately in Fig. 168.
We see that a' ⊥ OK and a' ⊥ c [because c ⊥ (P)] and the vector
a' is therefore perpendicular to the plane KOM. Besides, $a' = c\cdot OK_1$,
the last product being equal to the area of the
parallelogram OKLM constructed on the vectors a and c (why is it so?).
Consequently, by the definition of the vector product, a' = a × c.
Similarly, b' = b × c and d' = d × c. Now we deduce from
formulas (19) the first formula (18): (a + b) × c = d × c = d' =
= a' + b' = a × c + b × c. Changing the order of the factors
and simultaneously changing the signs we obtain the second
formula (18).

These properties enable us to remove brackets in expressions
containing vector products but in doing this we should pay attention
to the order of factors. Here we give an example:

$$(\mathbf{a} + 2\mathbf{b})\times(2\mathbf{a} - 3\mathbf{b}) = 2\mathbf{a}\times\mathbf{a} - 3\mathbf{a}\times\mathbf{b} + 4\mathbf{b}\times\mathbf{a} - 6\mathbf{b}\times\mathbf{b} = -7\mathbf{a}\times\mathbf{b}$$

It should be noted that the associative law does not hold for the
vector product: (a × b) × c may be unequal to a × (b × c). That is why
expressions of the form a × b × c must not be used without brackets.
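Both remarks are easy to check numerically. The sketch below assumes vectors are stored as tuples of Cartesian projections and computes the vector product by the component formula (20) of this section; the helper names `cross`, `add` and `scale` are ours.

```python
def cross(a, b):
    """Vector product through Cartesian projections, formula (20)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def add(a, b):
    return tuple(p + q for p, q in zip(a, b))

def scale(lam, a):
    return tuple(lam * p for p in a)

a, b = (1, 2, 3), (-2, 0, 5)
# (a + 2b) x (2a - 3b) should coincide with -7 (a x b).
left = cross(add(a, scale(2, b)), add(scale(2, a), scale(-3, b)))
right = scale(-7, cross(a, b))

i, j = (1, 0, 0), (0, 1, 0)
first = cross(cross(i, i), j)    # (i x i) x j = 0 x j = 0
second = cross(i, cross(i, j))   # i x (i x j) = i x k = -j
```

The last two lines give a concrete pair of bracketings that disagree, which is all that is needed to show that the associative law fails.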

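Fig. 168

6. Expressing vector product in terms of Cartesian projections.

Let vectors a and b be given in the form of their representations
(9). To express their vector product in terms of their Cartesian
projections we shall utilize the following equalities:

$$\mathbf{i}\times\mathbf{j} = \mathbf{k}, \quad \mathbf{j}\times\mathbf{i} = -\mathbf{k}, \quad \mathbf{j}\times\mathbf{k} = \mathbf{i}, \quad \mathbf{k}\times\mathbf{j} = -\mathbf{i}, \quad \mathbf{k}\times\mathbf{i} = \mathbf{j}, \quad \mathbf{i}\times\mathbf{k} = -\mathbf{j}$$

An essential fact is that these equalities do not depend on the
particular choice of a Cartesian coordinate system and on the choice
of the right-hand or left-hand screw rule. The verification of these
equalities is left to the reader. Now we have

$$\begin{aligned}
\mathbf{a}\times\mathbf{b} &= (a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k})\times(b_x\mathbf{i} + b_y\mathbf{j} + b_z\mathbf{k}) = \\
&= a_xb_y\mathbf{k} - a_xb_z\mathbf{j} - a_yb_x\mathbf{k} + a_yb_z\mathbf{i} + a_zb_x\mathbf{j} - a_zb_y\mathbf{i} = \\
&= \mathbf{i}(a_yb_z - a_zb_y) - \mathbf{j}(a_xb_z - a_zb_x) + \mathbf{k}(a_xb_y - a_yb_x)
\end{aligned} \qquad (20)$$

The result thus obtained can be rewritten in the form of a
determinant [see formula (VI.8)]:

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix}$$
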
This is a remarkable formula! In this form the formula for the vector
product is easy to memorize.

Suppose it is necessary to evaluate the area S of a parallelogram
constructed on the vectors a = 3i − 2j + k and b = −2i + j + 4k.
As we know, S = |a × b|. Calculating we find

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 3 & -2 & 1 \\ -2 & 1 & 4 \end{vmatrix} = \mathbf{i}(-8 - 1) - \mathbf{j}(12 + 2) + \mathbf{k}(3 - 4) = -9\mathbf{i} - 14\mathbf{j} - \mathbf{k}$$

and thus

$$S = |\mathbf{a}\times\mathbf{b}| = \sqrt{9^2 + 14^2 + 1^2} = 16.7$$

The result is dimensionless since the vectors a and b were
dimensionless. If we wanted to obtain the “genuine area” we should put,
for example, a = 3i cm and the like.
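The worked example can be repeated in code. This is a minimal sketch, assuming tuples of projections as before; `cross` implements formula (20) and `parallelogram_area` is a name of our own choosing.

```python
import math

def cross(a, b):
    """Vector product through Cartesian projections, formula (20)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def parallelogram_area(a, b):
    """Area of the parallelogram constructed on a and b, S = |a x b|."""
    return math.sqrt(sum(p * p for p in cross(a, b)))

a = (3, -2, 1)    # a = 3i - 2j + k
b = (-2, 1, 4)    # b = -2i + j + 4k
product = cross(a, b)              # projections of a x b
area = parallelogram_area(a, b)    # square root of 278
```

Interchanging the arguments of `cross` reverses the sign of every projection, in agreement with property 3, while the area stays the same.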

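Now we point out one more useful formula. Let us take two
vectors $\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j}$ and $\mathbf{b} = b_x\mathbf{i} + b_y\mathbf{j}$ in the x, y-plane. Suppose
it is necessary to evaluate the area S of the parallelogram
constructed on a and b. Since

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & 0 \\ b_x & b_y & 0 \end{vmatrix} = \begin{vmatrix} a_x & a_y \\ b_x & b_y \end{vmatrix}\mathbf{k}$$

the geometrical meaning of the vector product implies (verify it!)

$$\begin{vmatrix} a_x & a_y \\ b_x & b_y \end{vmatrix} = \pm S, \quad \text{that is} \quad S = \left|\, \begin{vmatrix} a_x & a_y \\ b_x & b_y \end{vmatrix} \,\right| \qquad (21)$$

We take + or − depending on whether the direction of the shortest
rotation from a to b coincides with the direction of the rotation
from i to j or not.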
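Formula (21) and the rule of signs can be sketched as follows, assuming plane vectors are given as pairs $(a_x, a_y)$; the names `det2` and `plane_area` are hypothetical.

```python
def det2(a, b):
    """Second-order determinant with rows (a_x, a_y) and (b_x, b_y)."""
    return a[0] * b[1] - a[1] * b[0]

def plane_area(a, b):
    """Area of the parallelogram constructed on plane vectors a and b, formula (21)."""
    return abs(det2(a, b))

a, b = (2, 0), (1, 3)
signed = det2(a, b)       # positive: rotation from a to b goes like that from i to j
reversed_ = det2(b, a)    # negative: the rotation goes the other way
```

The sign of the determinant therefore records the orientation of the pair a, b, and its absolute value the area.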

The notion of the moment of a vector a applied at a point M about
a fixed point O is connected with the notion of a vector product.
The moment $\operatorname{mom}_O\mathbf{a}$ is defined by the formula
$\operatorname{mom}_O\mathbf{a} = \overrightarrow{OM}\times\mathbf{a} = \mathbf{r}\times\mathbf{a}$. In physics we consider the moments of a force, of a
velocity, of a momentum and so on. If we change the position of the
origin of the vector a then, in general, its moment will also change.
Let us now translate the vector a along its “line of action” to a point
M'. Denote a in the new position as a'. Since $\overrightarrow{MM'} = \lambda\mathbf{a}$ we have

$$\operatorname{mom}_O\mathbf{a}' = \mathbf{r}'\times\mathbf{a} = (\mathbf{r} + \lambda\mathbf{a})\times\mathbf{a} = \mathbf{r}\times\mathbf{a} = \operatorname{mom}_O\mathbf{a}$$

Thus, such a translation does not affect the moment and we may
therefore regard the vector a entering in the definition of a moment
as a sliding vector (see Sec. 1).
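The invariance of the moment under sliding can be observed on numbers. The sketch assumes integer projections (so that the comparison is exact) and uses our own helper names `cross` and `moment`.

```python
def cross(a, b):
    """Vector product through Cartesian projections, formula (20)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def moment(r, a):
    """Moment about O of a vector a applied at the point with radius-vector r."""
    return cross(r, a)

r = (1, 2, 3)     # radius-vector of the point of application M
a = (4, -1, 2)    # the applied vector
lam = 5
# Slide a along its line of action: r' = r + lam * a.
r_slid = tuple(p + lam * q for p, q in zip(r, a))
# Move the point of application off the line of action instead.
r_off = (r[0] + 1, r[1], r[2])
```

Here `moment(r_slid, a)` equals `moment(r, a)`, whereas `moment(r_off, a)` in general does not.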
14. Pseudovectors. There are vectors which depend on whether
we choose the right-hand or the left-hand screw rule so that when
one of the rules is replaced by the other the directions of these
vectors are replaced by the opposite ones. Such vectors are called
pseudovectors (or axial vectors) in contrast to true vectors whose direction
is independent of the choice of a screw rule. For instance, the
velocity vector of a translatory motion of a solid does not depend on
the choice of a screw rule (this is implied by its physical meaning)
and is therefore a true vector. On the contrary, the angular velocity
vector of a rotary motion of a solid is a pseudovector. Indeed,
such a vector lies in the axis of revolution and its absolute value
is equal to the numerical value of the angular speed but its direction
depends on the choice of a screw rule (see Fig. 169).

Fig. 169

It is clear that by the definition of the vector product of two true
vectors a vector product is a pseudovector because if we change the
screw rule the former outer side of the parallelogram constructed
on the vectors a and b becomes the inner side and vice versa. The
same reasoning shows that the moment of a force (see the end of
Sec. 13) is a pseudovector. Further, the vector product of a true
vector by a pseudovector is a true vector whereas the vector product
of two pseudovectors is also a pseudovector. It is also easy to verify
that the linear velocity v of any point M of a rotating solid which
is a true vector is connected with the pseudovector ω by the
formula $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$ where $\mathbf{r} = \overrightarrow{OM}$ and O is an arbitrarily chosen
point lying on the axis of revolution.

We sometimes also distinguish between “true scalars” and
pseudoscalars. A pseudoscalar is a scalar which is multiplied by −1 when
the original screw rule is changed. For example, it is easy to verify
that the scalar product of a true vector by a pseudovector is a
pseudoscalar.

§ 5. Products of Three Vectors


15. Triple Scalar Product. Let three vectors a, b and c be given.
The scalar quantity (a × b)·c is called the triple (mixed) product
of the three vectors a, b and c. The geometrical significance of the
triple product is seen in Fig. 170:

$$(\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} = \mathbf{d}\cdot\mathbf{c} = d\,\operatorname{proj}_{\mathbf{d}}\mathbf{c} = |\mathbf{a}\times\mathbf{b}|\,\operatorname{proj}_{\mathbf{d}}\mathbf{c} = Sh = V$$

that is we have obtained the volume of a parallelepiped constructed
on the vectors a, b and c. The vectors a, b and c in Fig. 170 form
a right-handed triad, and the volume is obtained with the sign +.
If the triad were left-handed the angle between c and d would be
obtuse. In this case (a × b)·c = −V. (We suppose here that our
considerations are based on the right-hand screw rule as it was
pointed out in Sec. 12.)

Fig. 170 ($\operatorname{proj}_{\mathbf{d}}\mathbf{c} = h$)

Let us indicate the following properties of a triple product.

1. A circular permutation of factors does not change their triple
product since neither the parallelepiped (see Fig. 170) nor the
orientation of the triad of the vectorial factors changes under such a
transformation (see Sec. 12):

$$(\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} = (\mathbf{b}\times\mathbf{c})\cdot\mathbf{a} = (\mathbf{c}\times\mathbf{a})\cdot\mathbf{b}$$

But if we interchange only two factors the sign of the triple product
changes. For example, (c × b)·a = −(a × b)·c.

2. A triple product is equal to zero if and only if the three
vectors are parallel to a plane. Indeed, such a parallelism means that
the corresponding parallelepiped degenerates into a plane
geometrical figure, that is it has a zero volume.

3. By formulas (20) and (12) we deduce the following expression
of a triple product in terms of Cartesian projections:

$$\begin{aligned}
(\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} &= [(a_yb_z - a_zb_y)\mathbf{i} - (a_xb_z - a_zb_x)\mathbf{j} + (a_xb_y - a_yb_x)\mathbf{k}]\cdot(c_x\mathbf{i} + c_y\mathbf{j} + c_z\mathbf{k}) = \\
&= (a_yb_z - a_zb_y)c_x - (a_xb_z - a_zb_x)c_y + (a_xb_y - a_yb_x)c_z
\end{aligned}$$

or, finally,

$$(\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}$$

(to verify the formula expand the determinant in minors of the last
row according to Sec. VI.3).

Let three vectors a, b and c be given in the form of the expression
in terms of their Cartesian projections. Then, by the above formula
and property 2, we obtain the necessary and sufficient condition
for these three vectors to be parallel to the same plane:

$$\begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix} = 0 \qquad (22)$$

The triple product of three true vectors (see Sec. 14) is a scalar
product of a pseudovector by a true vector, i.e. a pseudoscalar.
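The properties of the triple product and condition (22) can be tried out in a short sketch. Vectors are again assumed to be tuples of integer projections; `dot`, `cross`, `triple` and `coplanar` are our own helper names.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    """Vector product through Cartesian projections, formula (20)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def triple(a, b, c):
    """Triple scalar product (a x b) . c, i.e. the determinant of rows a, b, c."""
    return dot(cross(a, b), c)

def coplanar(a, b, c):
    """Condition (22): the three vectors are parallel to one plane."""
    return triple(a, b, c) == 0

a, b, c = (3, -2, 1), (-2, 1, 4), (1, 1, 1)
value = triple(a, b, c)
cyclic = triple(b, c, a)     # circular permutation keeps the value
swapped = triple(c, b, a)    # interchanging two factors changes the sign
```

With floating-point projections the exact comparison with zero in `coplanar` should be replaced by a tolerance; that refinement is left out of the sketch.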

16. Triple Vector Product. The vector
(a × b) × c is called the triple vector product
of three vectors a, b and c. It has no important
geometrical meaning but is expressed by a
formula which is of use for some applications. To deduce this
formula let us choose the Cartesian coordinate axes in such a way
that the x-axis is directed along the vector a and the y-axis
lies in the plane of vectors a and b (see Fig. 171). Then the
projections of the vector a on the y-axis and on the z-axis will be
equal to zero, that is $\mathbf{a} = a_x\mathbf{i}$. Similarly, $\mathbf{b} = b_x\mathbf{i} + b_y\mathbf{j}$ and
$\mathbf{c} = c_x\mathbf{i} + c_y\mathbf{j} + c_z\mathbf{k}$. From this we obtain

Fig. 171

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & 0 & 0 \\ b_x & b_y & 0 \end{vmatrix} = a_xb_y\mathbf{k},$$

$$\begin{aligned}
(\mathbf{a}\times\mathbf{b})\times\mathbf{c} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 0 & a_xb_y \\ c_x & c_y & c_z \end{vmatrix} = -\mathbf{i}\,a_xb_yc_y + \mathbf{j}\,a_xb_yc_x = \\
&= a_xc_x(b_x\mathbf{i} + b_y\mathbf{j}) - (b_xc_x + b_yc_y)\,a_x\mathbf{i}
\end{aligned}$$

(check up these formulas!). Finally, using formula (12) we get

$$(\mathbf{a}\times\mathbf{b})\times\mathbf{c} = (\mathbf{a}\cdot\mathbf{c})\,\mathbf{b} - (\mathbf{b}\cdot\mathbf{c})\,\mathbf{a}$$

This final formula no longer contains any coordinate projections
and therefore does not depend on the particular choice of the
coordinate system.

The following formula is also sometimes of use: 


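$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = -(\mathbf{b}\times\mathbf{c})\times\mathbf{a} = -[(\mathbf{b}\cdot\mathbf{a})\,\mathbf{c} - (\mathbf{c}\cdot\mathbf{a})\,\mathbf{b}] = (\mathbf{a}\cdot\mathbf{c})\,\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\,\mathbf{c}$$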
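Both expansion formulas for the triple vector product can be confirmed on an arbitrary numerical example. The sketch assumes integer projections, so that both sides can be compared exactly; all helper names are ours.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    """Vector product through Cartesian projections, formula (20)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def combine(lam, u, mu, v):
    """Linear combination lam * u + mu * v."""
    return tuple(lam * p + mu * q for p, q in zip(u, v))

a, b, c = (1, 2, 3), (-2, 0, 5), (4, -1, 2)

# (a x b) x c = (a.c) b - (b.c) a
left_1 = cross(cross(a, b), c)
right_1 = combine(dot(a, c), b, -dot(b, c), a)

# a x (b x c) = (a.c) b - (a.b) c
left_2 = cross(a, cross(b, c))
right_2 = combine(dot(a, c), b, -dot(a, b), c)
```

A single example is of course no proof; it only guards against a slip of sign when the formulas are used.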


In conclusion we note that vector algebra is a comparatively new 
branch of mathematics. It was created and developed in the second 
half of the 19th century in connection with problems of algebra,
geometry, mechanics and physics. 


§ 6. Linear Spaces 


17. Concept of Linear Space. One of the characteristic features 
of vectors is the possibility to perform linear operations on them, 
that is the operation of addition and the operation of multiplica- 
tion by numbers (see § 1). These operations can also be performed 
on some other objects, such as polynomials or arbitrary functions. 
Since these operations have similar properties in all cases, there is 
every reason to consider the general notion of a linear space which 
is understood as a set of some objects such that the linear operations 
can be performed on them within the set. Such a general, abstract, 
consideration yields a general view on linear operations which 
enables us to find some important properties in concrete problems. 

Let (R) be a set (totality) of some objects. We shall call these
objects elements (members) of (R). If a is one of the objects we also
say that a belongs to (R). This fact is designated as a ∈ (R). The
fact that an element b does not belong to (R) is written as b ∉ (R).
For instance, if (I) is the set of all integers then 3 ∈ (I), −5 ∈ (I)
but π ∉ (I).

Now we turn to the strict definition of the concept of a linear space.
A set (R) is called a linear space if for any x ∈ (R) and y ∈ (R) the
notion of the sum x + y ∈ (R) is defined in a certain way, and if
for any real number λ the product λx = xλ ∈ (R) is defined. For
example, we can regard the set of all vectors for which the addition
and the multiplication by numbers are performed in accordance with
the rules of § 1 as a space (R). The set of all complex numbers for
which the rules of addition and multiplication by real numbers are
known from elementary mathematical courses can also be regarded
as a linear space. Some other examples will be given in Sec. 18.
Besides, the operations of addition and multiplication must have
some natural properties, namely those properties which were proved
in § 1 for vectors and which should be introduced as the axioms
of a linear space in the general case. The properties being essentially
the same as those of vectors, the elements of a linear space are
also often called vectors and are denoted like vectors.

Here we shall give the axioms of a linear space without discussing 
the question of their independence. The thing is that some of these 
axioms are the consequences of the rest but this is of no importance 
for the aims of our course. Thus, the sum must satisfy the following 
conditions: 

1. Associativity: (x + y) + z = x + (y + z) for any x, y, z ∈ (R).

2. Commutativity: x + y = y + x for any x, y ∈ (R).

3. The existence of a zero element in (R): there must be an
element [denoted as 0; 0 ∈ (R)] which satisfies the condition x + 0 = x
for any x ∈ (R).

4. The existence of the negative of x ∈ (R): for any x ∈ (R) there
must be an element [denoted as −x; −x ∈ (R)] which satisfies the
condition (−x) + x = 0.

It is easily verified that the zero element of a space must be unique
and that there is only one negative for any element. We shall not
discuss the general proof of these facts (in all the concrete examples
which we shall consider here these properties are obvious).

The multiplication of an element by a number must satisfy the 
following requirements: 

5. λ (μx) = (λμ) x.


x= 


The operation of division by a number is introduced by the formula

$$\frac{x}{\lambda} = \frac{1}{\lambda}\,x \quad (\lambda \ne 0).$$

Finally, both linear operations are connected by the distributive
laws:

10. (λ + μ) x = λx + μx and 11. λ (x + y) = λx + λy.

All these properties make it possible to perform linear operations
in linear spaces and transformations of linear combinations of ele- 
ments (see Sec. 5) according to the usual arithmetical rules. Linear 
spaces and their properties are treated in detail in courses on linear 
algebra (see, for example, [16]). 

It is sometimes necessary to consider sets of elements in which
only one operation of addition satisfying axioms 1-4 is defined. Such
a set is called an Abelian group after N. H. Abel (1802-1829), a
Norwegian scientist and one of the most prominent mathematicians
of the 19th century. Besides, it is sometimes possible to perform the
multiplication not only by real numbers but also by any complex
numbers. A linear space of this kind is called a linear space over
the field of complex numbers or, briefly, a complex linear space. The
term “a number field” is applied to any set of numbers in which
it is possible to perform the four fundamental operations of
arithmetic with the natural exception of the division by zero. For example,
all the real numbers or all the complex numbers form a field whereas
the set of all integers does not form a number field (why is it so?).

18. Examples. 

1. One of the simplest examples of a linear space is the set of all 
ordinary vectors with the linear operations described in § 1. The 
set of all vectors parallel to a certain plane is a linear space which 
is a linear subspace of the previous space. The set of all vectors 
parallel to a certain straight line is also a subspace of the space of 
all vectors. The zero vector itself is a linear space, from the formal 
point of view, since all the axioms 1-11 are fulfilled here.

In the general case a set $(R_1)$ contained in a linear space (R) is
called a linear subspace of (R) if $(R_1)$ itself is a linear space with
the operations which are originally defined for elements of (R). In
other words, there must be x + y ∈ $(R_1)$ and λx ∈ $(R_1)$ for any
x ∈ $(R_1)$, y ∈ $(R_1)$ and an arbitrary real λ; if these two conditions
hold the verification of axioms 1-11 is no longer needed since they
are fulfilled in the whole space (R).

2. The set of all polynomials P (x) of degree not higher than n
where n ≥ 0 is a given integer is a linear space. If n assumes
successive values 0, 1, 2 etc. we get a sequence of linear spaces and each
subsequent space contains all the preceding ones as linear subspaces.
A still more general linear space is formed by the totality of all
functions f (x) defined over a fixed interval. Thus, each of these
functions f (x) can be regarded as a vector of the linear space of
functions or, as we say, of a functional space. This approach is 
characteristic of modern mathematics. 

3. The so-called n-dimensional real Cartesian space $E_n$ where
n = 1, 2, 3, ... represents a very important example of a linear
space and we shall consider this notion here.

We regard each ordered n-tuple $(a_1, a_2, \ldots, a_{n-1}, a_n)$ of real
numbers $a_1, a_2, \ldots, a_{n-1}, a_n$ as an element of $E_n$. Take for example
$E_4$. Then each element of the form $(a_1, a_2, a_3, a_4)$ where $a_1$, $a_2$, $a_3$
and $a_4$ are arbitrary real numbers is a point or a vector of $E_4$
(ordinary geometrical vectors may be regarded as points of $E_3$ since we
can regard them as radius-vectors of points; the vectors, i.e. the
elements, of a general linear space are therefore also called points).

The numbers $a_1$, $a_2$, $a_3$ and $a_4$ are called the coordinates of the point
(of the vector) $(a_1, a_2, a_3, a_4)$. For example, (−3, 0, 4, 3) is one of
such points. The point (0, 0, 0, 0) is called the origin (of the
coordinate system) in $E_4$. The whole space $E_4$ is the set, the totality,
of all such points.

The space $E_3$ can be represented in the most visual way. Indeed,
suppose that we introduce an affine coordinate system or, in
particular, a Cartesian coordinate system (see § 3) in the usual
geometrical space and then “rivet” it, that is we fix this system. If now we
regard the ordered set of the coordinates of every point as the point
itself we arrive at the space $E_3$. (How can we get $E_2$ and $E_1$ by the
same reasoning?) The advantage of such a way of representing a space
in the form of a space of number n-tuples is that we are not confined
to a certain dimension and that $E_n$ is considered in the same manner
for all n = 1, 2, 3, ... .

Let us return to $E_4$. We shall agree that every pair of points
$A(a_1, a_2, a_3, a_4)$ and $B(b_1, b_2, b_3, b_4)$ determines a generalized
vector $\overrightarrow{AB}$ with the origin A and the terminus B. We shall drop the
word “generalized” and simply call it a vector. Such a vector is
in fact nothing but an ordered pair of points. Since we usually
consider free vectors (see Sec. 1) let us agree that a pair of points
of the form $A'(a_1 + \alpha, a_2 + \beta, a_3 + \gamma, a_4 + \delta)$ and
$B'(b_1 + \alpha, b_2 + \beta, b_3 + \gamma, b_4 + \delta)$ determines one and the same vector for
any α, β, γ and δ, namely the vector determined by the pair A
and B. This means that $\overrightarrow{A'B'} = \overrightarrow{AB}$. We shall say that the vector
$\overrightarrow{A'B'}$ is obtained from $\overrightarrow{AB}$ by a parallel translation, the
displacement vector being $\overrightarrow{AA'} = \overrightarrow{BB'}$. Such a translation makes it possible
to transfer the origin of any vector to any point of $E_4$. For example,
putting $\alpha = -a_1$, $\beta = -a_2$, $\gamma = -a_3$ and $\delta = -a_4$ we transfer
the origin of the vector $\overrightarrow{AB}$ to the origin of coordinates. Then the
terminus of the vector will be placed at the point $M(x_1, x_2, x_3, x_4)$
where $x_1 = b_1 - a_1$, $x_2 = b_2 - a_2$, $x_3 = b_3 - a_3$ and $x_4 = b_4 - a_4$.
It is natural to call the vector $\overrightarrow{OM} = \overrightarrow{AB}$ the radius-vector of the
point M.

The differences between the corresponding coordinates of the
terminus and of the origin of a vector $\overrightarrow{AB}$, that is the numbers
$b_1 - a_1$, $b_2 - a_2$, $b_3 - a_3$ and $b_4 - a_4$ are called the coordinates
of the vector. They do not change under any parallel translation
of the vector. Thus, a free vector in a Cartesian space is completely
characterized by its coordinates. If the coordinates of a vector and
of its origin are known we can easily find the coordinates of its
terminus.

For example, if we “draw” the vector x (2, −5, 0, 1) from the
point A (51, 0, 2, −4) its terminus will be at the point B (53, −5,
2, −3). If we draw the same vector from the point O (0, 0, 0, 0) the
terminus will be at the point M (2, −5, 0, 1).

The linear operations on vectors given in their coordinate
representation are defined by formulas analogous to formulas (10) and
(11): if vectors $\mathbf{x}\,(x_1, x_2, x_3, x_4)$ and $\mathbf{y}\,(y_1, y_2, y_3, y_4)$ are given
then the vector x + y has the coordinates $(x_1 + y_1, x_2 + y_2,
x_3 + y_3, x_4 + y_4)$, and the vector λx has the coordinates $(\lambda x_1, \lambda x_2,
\lambda x_3, \lambda x_4)$. We can easily verify that all the axioms of a linear space
hold in this case and, besides, 0 = (0, 0, 0, 0) and $-\mathbf{x} = (-x_1, -x_2,
-x_3, -x_4)$.
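These coordinate rules carry over to code almost word for word. The sketch below assumes points and vectors of $E_n$ are Python tuples of equal length; the function names are hypothetical and chosen only for this illustration.

```python
def add(x, y):
    """Coordinates of x + y."""
    return tuple(p + q for p, q in zip(x, y))

def scale(lam, x):
    """Coordinates of lam * x."""
    return tuple(lam * p for p in x)

def vector_between(a, b):
    """Coordinates of the vector AB: terminus minus origin."""
    return tuple(q - p for p, q in zip(a, b))

def terminus(origin, x):
    """Terminus of the vector x drawn from the point `origin`."""
    return add(origin, x)

x = (2, -5, 0, 1)
A = (51, 0, 2, -4)
B = terminus(A, x)                 # the example of the text
shift = (7, 7, -1, 3)              # an arbitrary parallel translation
translated = vector_between(add(A, shift), add(B, shift))
```

Since nothing in these functions refers to the number 4, the same code serves $E_n$ for every n.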

19. Dimension of Linear Space. Let a linear space (R) be con- 
sidered. The notions of a linear combination and of a linear depen- 
dence of vectors are introduced in the space in the same way as it 
was done in Sec. 5. But in the general case four given vectors may 
not be linearly dependent. We can have the following two possi- 
bilities here. 

1. It is possible to find n linearly independent vectors in (R) 
but any system of n + 1 vectors is linearly dependent. Then we 
say that the space (R) is n-dimensional. Any system of n linearly 
independent vectors of (R) is called a basis in (R) in this case. Thus, 
the dimension of a linear space is the maximal possible number of 
vectors belonging to the space which form a linearly independent system. 

2. It is possible to find an arbitrarily large number of linearly 
independent vectors in (R). Then the space (R) is said to be infinite- 
dimensional. 

The above definition of a dimension is in agreement with the 
ordinary idea of a dimension. Actually, we see that according to 
Sec. 5 the space of ordinary geometrical vectors is three-dimensional, 
the space of vectors which are parallel to a plane is two-dimensional 
and the space of vectors parallel to a straight line is one-dimensional. 
A space consisting of a single vector which is the zero vector is 
formally considered to be zero-dimensional. 

The following lemma which is useful for calculating the dimension of a space is quite simple: let each of the vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k$ be a linear combination of the vectors $\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_l$ where $k > l$. Then the vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k$ are linearly dependent.


We shall not give the general proof of the lemma here and restrict ourselves to demonstrating it by a special case. For example, suppose the vectors $\mathbf{x}_1$, $\mathbf{x}_2$ and $\mathbf{x}_3$ are linear combinations of the vectors $\mathbf{y}_1$ and $\mathbf{y}_2$. Then

$$\mathbf{x}_1 = \alpha_1\mathbf{y}_1 + \beta_1\mathbf{y}_2, \quad \mathbf{x}_2 = \alpha_2\mathbf{y}_1 + \beta_2\mathbf{y}_2, \quad \mathbf{x}_3 = \alpha_3\mathbf{y}_1 + \beta_3\mathbf{y}_2 \qquad (23)$$

If the determinant $D = \begin{vmatrix} \alpha_1 & \beta_1 \\ \alpha_2 & \beta_2 \end{vmatrix} \ne 0$ we can regard the first two equalities as a system of equations in $\mathbf{y}_1$ and $\mathbf{y}_2$. Solving these equations we express $\mathbf{y}_1$ and $\mathbf{y}_2$ in the form of linear combinations of $\mathbf{x}_1$ and $\mathbf{x}_2$. Substituting the expressions thus obtained into the third equation (23) we arrive at a linear dependence between $\mathbf{x}_1$, $\mathbf{x}_2$ and $\mathbf{x}_3$. If $D = 0$ the right-hand sides of the first two relations (23) are proportional to each other and therefore even $\mathbf{x}_1$ and $\mathbf{x}_2$ are linearly dependent here, the same being true, of course, for $\mathbf{x}_1$, $\mathbf{x}_2$ and $\mathbf{x}_3$.

Let us now consider as an example the linear space of all polynomials $P(x)$ of degree $\le 2$ (see Sec. 18), that is of polynomials of the form $ax^2 + bx + c$. All the polynomials are linear combinations of the powers $x^2$, $x^1 = x$ and $x^0 = 1$, the powers themselves being linearly independent. Indeed, none of the powers is a linear combination of the rest. It follows that the space in question is three-dimensional. The above lemma implies that any set consisting of more than three such polynomials is linearly dependent. The elements $1$, $x$ and $x^2$ form a basis in this space.

Similarly, we conclude that the space of polynomials of degree $\le n$ has the dimension $n + 1$. The space of polynomials of all degrees is infinite-dimensional.

The linear space $E_n$ (see Sec. 18), as we could naturally expect, is $n$-dimensional in the sense of our definition. We shall illustrate this property by taking the space $E_4$ as an example. Let us introduce the following vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ and $\mathbf{e}_4$:

$$\mathbf{e}_1(1, 0, 0, 0), \quad \mathbf{e}_2(0, 1, 0, 0), \quad \mathbf{e}_3(0, 0, 1, 0), \quad \mathbf{e}_4(0, 0, 0, 1)$$

Obviously, the vectors are linearly independent. Besides, every vector $\mathbf{x}(x_1, x_2, x_3, x_4)$ belonging to $E_4$ is represented as a linear combination of $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ and $\mathbf{e}_4$:

$$\mathbf{x}(x_1, x_2, x_3, x_4) = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + x_3\mathbf{e}_3 + x_4\mathbf{e}_4$$

Hence, by the lemma, the linear space in question is four-dimensional. The vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ and $\mathbf{e}_4$ form a basis in $E_4$.

There is an infinitude of bases in every finite-dimensional linear space. For instance, let us consider the space of polynomials of degree $\le 2$. Choose three arbitrary values $x = x_1$, $x = x_2$, $x = x_3$ and denote the polynomial of the second degree which is equal to 1 at the point $x = x_k$ ($k = 1, 2, 3$) and equal to zero at the other points $x_l$ ($l = 1, 2, 3$; $l \ne k$) by $P_k(x)$. We can easily verify that $P_1(x) = \dfrac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)}$. The polynomials $P_2(x)$ and $P_3(x)$ are expressed similarly. The three polynomials $P_1(x)$, $P_2(x)$ and $P_3(x)$ are a basis in the space in question. The representation of an arbitrary polynomial of degree $\le 2$ as a linear combination of $P_1(x)$, $P_2(x)$ and $P_3(x)$ is nothing but a special case of Lagrange’s interpolation formula (V.23). Of course, we suppose here that $x_k \ne x_l$ for $k, l = 1, 2, 3$, $k \ne l$.
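As an illustration of this basis, the following Python sketch (not from the original text; the nodes, the sample polynomial `Q` and the helper name `lagrange_basis` are arbitrary choices made here) builds $P_1$, $P_2$, $P_3$ with exact rational arithmetic and checks that a polynomial of degree $\le 2$ is recovered from its values at the three points.

```python
from fractions import Fraction

def lagrange_basis(nodes, k):
    """Return P_k as a function: it equals 1 at nodes[k] and 0 at the other nodes."""
    def P(x):
        value = Fraction(1)
        for l, node in enumerate(nodes):
            if l != k:
                value *= Fraction(x - node, nodes[k] - node)
        return value
    return P

nodes = [-1, 0, 2]                      # three distinct points (arbitrary choice)
P = [lagrange_basis(nodes, k) for k in range(3)]

def Q(x):
    """An arbitrary polynomial of degree <= 2."""
    return 3 * x**2 - x + 5

def Q_from_basis(x):
    """Q written in the basis P_1, P_2, P_3; its coordinates are the values Q(x_k)."""
    return sum(Q(node) * P[k](x) for k, node in enumerate(nodes))

print([int(Q_from_basis(x)) for x in range(-3, 4)])
print([Q(x) for x in range(-3, 4)])
```

The coordinates of a polynomial in this basis are thus simply its values at the chosen points, which is what makes the basis convenient for interpolation.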

We have already mentioned the notion of a linear subspace $(R_1)$ (see Sec. 18) of a linear space $(R)$. It is possible to prove that if $(R_1) \ne (R)$ and if the space $(R)$ is finite-dimensional the dimension of $(R_1)$ is less than that of $(R)$. The proof which we leave to the reader is rather simple; it is based on the definition of a dimension. By the way, when proving the above statement it is convenient to use the following definition of linear dependence, which is equivalent to the former definition: vectors $\mathbf{a}, \mathbf{b}, \ldots, \mathbf{d}$ are called linearly dependent if they satisfy a relation of the form $\alpha\mathbf{a} + \beta\mathbf{b} + \ldots + \delta\mathbf{d} = \mathbf{0}$ where at least one of the numbers $\alpha, \beta, \ldots, \delta$ is different from zero.

The important notion of a hyperplane is introduced with the aid of the notion of a linear subspace. Let us take a linear subspace of $E_n$ and draw all the vectors of the subspace from a certain point of $E_n$. Then the set of points which are the termini of the vectors is a hyperplane. One-dimensional hyperplanes in $E_n$ are called straight lines and two-dimensional hyperplanes in $E_n$ are called planes. Besides, there are hyperplanes of dimensions 3, 4 etc. up to $n - 1$. We also introduce the formal notion of a “hyperplane of dimension 0” which is simply a separate point and the notion of a “hyperplane of dimension $n$” which is the whole space $E_n$. It is possible to construct “a stereometry” in $E_n$.

In conclusion we mention an important notion of an isomorphism 
of linear spaces. Two linear spaces (R) and (R’) are called isomorphic 
(or, more precisely, linearly isomorphic) if it is possible to establish 
a one-to-one correspondence between the vectors of the spaces in 
such a way that the linear operations on the corresponding vectors 
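are performed according to similar rules. A more comprehensive statement of this property is that if the vectors $\mathbf{x}, \mathbf{y} \in (R)$ correspond, respectively, to the vectors $\mathbf{x}', \mathbf{y}' \in (R')$ then the vector $\mathbf{x} + \mathbf{y}$ must correspond to the vector $\mathbf{x}' + \mathbf{y}'$ and the vector $\lambda\mathbf{x}$ must correspond to the vector $\lambda\mathbf{x}'$, i.e.

$$(\mathbf{x} + \mathbf{y})' = \mathbf{x}' + \mathbf{y}', \qquad (\lambda\mathbf{x})' = \lambda\mathbf{x}'$$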


where the prime designates the transition from a vector of (R) to 
the corresponding vector of (R’). The term “one-to-one correspon- 
dence” means that only one vector from (R) corresponds to every 
vector from (R’) and vice versa. Isomorphic linear spaces are indis- 
tinguishable from the point of view of linear operations since all 
the implications based on such operations in one of the spaces are 
true for all the isomorphic ones. 

In particular, it follows that isomorphic linear spaces are of the same dimension. Conversely, all the finite-dimensional linear spaces of one and the same dimension are isomorphic to each other. Indeed, if $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ is a basis in $(R)$ and $\mathbf{p}'_1, \mathbf{p}'_2, \ldots, \mathbf{p}'_n$ is a basis in $(R')$ it is easy to verify that the correspondence of the form

$$\alpha_1\mathbf{p}_1 + \alpha_2\mathbf{p}_2 + \ldots + \alpha_n\mathbf{p}_n \leftrightarrow \alpha_1\mathbf{p}'_1 + \alpha_2\mathbf{p}'_2 + \ldots + \alpha_n\mathbf{p}'_n$$

yields a desired isomorphism. Hence, with respect to linear operations, every finite-dimensional linear space is completely characterized by its dimension. In particular, we conclude that each finite-dimensional linear space of dimension $n$ is isomorphic to the vector space $E_n$.

The theory of finite-dimensional linear spaces and its applications 
are studied in linear algebra whereas infinite-dimensional spaces 
are considered in functional analysis. 

20. Concept of Euclidean Space. A linear space (R) is called a 
Euclidean space if the notion of a scalar product of any two vectors 
of (R) is introduced and if the following natural axioms of a scalar 
product are fulfilled: 

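12. The scalar product $\mathbf{x}\cdot\mathbf{y} = (\mathbf{x}, \mathbf{y})$ of any two vectors $\mathbf{x}, \mathbf{y} \in (R)$ is a real number;

13. $(\mathbf{x}, \mathbf{x}) > 0$ for any $\mathbf{x} \ne \mathbf{0}$;

14. $(\mathbf{x}, \mathbf{y}) = (\mathbf{y}, \mathbf{x})$;

15. $(\mathbf{x} + \mathbf{y}, \mathbf{z}) = (\mathbf{x}, \mathbf{z}) + (\mathbf{y}, \mathbf{z})$;

16. $(\lambda\mathbf{x}, \mathbf{y}) = \lambda(\mathbf{x}, \mathbf{y})$.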

All these properties were proved for ordinary geometrical vectors 
in § 3. Thus, the set of all geometrical vectors is a Euclidean space. 
The set of all vectors parallel to a plane is also a Euclidean space. 
The same is true for the set of all vectors parallel to a straight line. 
It is clear that a linear subspace of a Euclidean space is always 
a Euclidean space. 

The set of all vectors of the $n$-dimensional Cartesian space $E_n$ represents an important example of a Euclidean space if we introduce the scalar product of any two vectors $\mathbf{x}(x_1, x_2, \ldots, x_n)$ and $\mathbf{y}(y_1, y_2, \ldots, y_n)$ by the formula

$$(\mathbf{x}, \mathbf{y}) = x_1y_1 + x_2y_2 + \ldots + x_ny_n \qquad (24)$$

which is analogous to the well-known formula (12). It is easy to verify that all axioms 12-16 are fulfilled here.

It is possible to introduce the notion of a length or, as we often say, of a norm of any vector by means of the formula $|\mathbf{x}| = \sqrt{(\mathbf{x}, \mathbf{x})}$. The norm of $\mathbf{x}$ is also denoted as $\|\mathbf{x}\|$. Axiom 13 implies that the norm of any nonzero vector is positive. If the norm of a vector is equal to unity the vector is called a normalized vector (for ordinary geometrical vectors we use the term “a unit vector”). From axioms 14 and 16 it straightway follows that

$$|\lambda\mathbf{x}| = \sqrt{(\lambda\mathbf{x}, \lambda\mathbf{x})} = \sqrt{\lambda^2(\mathbf{x}, \mathbf{x})} = |\lambda|\sqrt{(\mathbf{x}, \mathbf{x})} = |\lambda|\,|\mathbf{x}|$$


This implies, in particular, that $|\mathbf{0}| = 0$ and that each vector $\mathbf{x} \ne \mathbf{0}$ can be normalized, that is we obtain a normalized vector by dividing $\mathbf{x}$ by $|\mathbf{x}|$.

Let us now deduce an estimate for a scalar product of two arbitrary vectors $\mathbf{x}, \mathbf{y} \in (R)$. In order to do this we note that we have

$$0 \le |\mathbf{x} + \lambda\mathbf{y}|^2 = (\mathbf{x} + \lambda\mathbf{y}, \mathbf{x} + \lambda\mathbf{y}) = (\mathbf{y}, \mathbf{y})\lambda^2 + 2(\mathbf{x}, \mathbf{y})\lambda + (\mathbf{x}, \mathbf{x}) \qquad (25)$$

for any $\lambda$. If we regard the right-hand side of (25) as a quadratic trinomial in $\lambda$ the retention of its sign implies that the discriminant of the trinomial is non-positive, i.e.

$$(\mathbf{x}, \mathbf{y})^2 - (\mathbf{x}, \mathbf{x})(\mathbf{y}, \mathbf{y}) \le 0$$

(why is it so?). From this we obtain

$$|(\mathbf{x}, \mathbf{y})| = \sqrt{(\mathbf{x}, \mathbf{y})^2} \le \sqrt{(\mathbf{x}, \mathbf{x})(\mathbf{y}, \mathbf{y})} = |\mathbf{x}|\,|\mathbf{y}| \qquad (26)$$

This important inequality was established in 1821 for $E_n$ by Cauchy (Cauchy in fact used a terminology different from that of the theory of linear spaces). For some other cases the inequality was deduced in 1859 by V. Ya. Bunyakovsky (1804-1889), a prominent Russian mathematician, and by the German mathematician H. Schwarz (1843-1921) in 1884.

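By inequality (26), we obtain, in particular,

$$|\mathbf{x} + \mathbf{y}|^2 = (\mathbf{x} + \mathbf{y}, \mathbf{x} + \mathbf{y}) = (\mathbf{x}, \mathbf{x}) + 2(\mathbf{x}, \mathbf{y}) + (\mathbf{y}, \mathbf{y}) \le |\mathbf{x}|^2 + 2|\mathbf{x}|\,|\mathbf{y}| + |\mathbf{y}|^2 = (|\mathbf{x}| + |\mathbf{y}|)^2$$

which implies

$$|\mathbf{x} + \mathbf{y}| \le |\mathbf{x}| + |\mathbf{y}|$$

The last inequality is called the triangle inequality (think why it is called so).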
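Both estimates are easy to test numerically in $E_4$ with the scalar product (24). The following Python sketch is only an illustration; the sample vectors are arbitrary and the helper names `dot` and `norm` are hypothetical.

```python
import math

def dot(x, y):
    """Scalar product (24) of two vectors given by their coordinates."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Norm |x| = sqrt((x, x))."""
    return math.sqrt(dot(x, x))

x = (1.0, -2.0, 0.5, 3.0)
y = (4.0, 0.0, -1.0, 2.0)

# inequality (26): |(x, y)| <= |x| |y|
lhs_cs, rhs_cs = abs(dot(x, y)), norm(x) * norm(y)

# triangle inequality: |x + y| <= |x| + |y|
s = tuple(xi + yi for xi, yi in zip(x, y))
lhs_tri, rhs_tri = norm(s), norm(x) + norm(y)

# for proportional vectors (26) turns into an equality
z = tuple(-2.0 * xi for xi in x)
lhs_eq, rhs_eq = abs(dot(x, z)), norm(x) * norm(z)

print(lhs_cs <= rhs_cs, lhs_tri <= rhs_tri, math.isclose(lhs_eq, rhs_eq))
```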

A Euclidean space over the complex number field (see Sec. 17) or, as we briefly call it, a complex Euclidean space is also considered in mathematics. In this case a scalar product of two vectors can be a complex number and axiom 14 is therefore replaced by a new axiom of the form $(\mathbf{x}, \mathbf{y}) = (\mathbf{y}, \mathbf{x})^*$ where the asterisk indicates the conjugate complex number. In particular, this implies that $(\mathbf{x}, \lambda\mathbf{y}) = (\lambda\mathbf{y}, \mathbf{x})^* = [\lambda(\mathbf{y}, \mathbf{x})]^* = \lambda^*(\mathbf{x}, \mathbf{y})$. We can easily verify that all other assertions of this section remain true for complex spaces. The $n$-dimensional complex Cartesian space $Z_n$ with the vectors of the form $\mathbf{x}(x_1, x_2, \ldots, x_n)$ where $x_1, \ldots, x_n$ are arbitrary complex numbers is an important example of a complex Euclidean space if we introduce a scalar product by means of the formula $(\mathbf{x}, \mathbf{y}) = x_1y_1^* + x_2y_2^* + \ldots + x_ny_n^*$ which substitutes for formula (24). We leave to the reader the verification of all the axioms and, in particular, axiom 13 for this case.
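A quick numerical check of these properties may be helpful. The sketch below is an illustration only; it uses Python's built-in complex numbers, arbitrary sample vectors of $Z_2$ and a hypothetical helper `cdot` implementing the complex scalar product written above.

```python
def cdot(x, y):
    """Complex scalar product (x, y) = x_1 y_1* + ... + x_n y_n*."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

x = (1 + 2j, 3 - 1j)
y = (2 - 1j, 1j)
lam = 2 - 3j
lam_y = tuple(lam * yi for yi in y)

print(cdot(x, y), cdot(y, x).conjugate())          # modified axiom 14
print(cdot(x, x))                                  # axiom 13: real and positive
print(cdot(x, lam_y), lam.conjugate() * cdot(x, y))
```

The conjugation in the second factor is exactly what makes $(\mathbf{x}, \mathbf{x}) = |x_1|^2 + \ldots + |x_n|^2$ real and positive for $\mathbf{x} \ne \mathbf{0}$.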

21. Orthogonality. The notion of an angle between any two vectors is introduced in a natural way in a Euclidean space $(R)$ by the formula

$$\cos(\widehat{\mathbf{x}, \mathbf{y}}) = \frac{(\mathbf{x}, \mathbf{y})}{|\mathbf{x}|\,|\mathbf{y}|} \qquad (27)$$

On the basis of Sec. 7 we see that the definition yields a usual angle in the case of ordinary geometrical vectors. It is necessary to note that, by inequality (26), the right-hand side of equality (27) does not exceed unity in its absolute value and therefore definition (27) always makes sense. In case vectors $\mathbf{x}$ and $\mathbf{y}$ are linearly dependent, i.e. $\mathbf{y} = \lambda\mathbf{x}$, formula (27) implies $(\widehat{\mathbf{x}, \mathbf{y}}) = 0°$ if $\lambda > 0$ and $(\widehat{\mathbf{x}, \mathbf{y}}) = 180°$ if $\lambda < 0$. Conversely, if $\cos(\widehat{\mathbf{x}, \mathbf{y}}) = \pm 1$ then inequality (26) shows that the vectors $\mathbf{x}$ and $\mathbf{y}$ are linearly dependent, but we leave the proof to the reader.
In the case 


(x, y) =0 (28) 


we have an important situation when the vectors x and y form an 
angle of 90° or, as it is often said, the vectors are orthogonal. If at 
least one of the vectors x or y is equal to 0 relation (28) necessarily 
holds and therefore the zero vector is regarded as orthogonal to any 
given vector although we can attribute any value to the angle which 
the zero vector forms with a given vector. 

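A system of vectors $\mathbf{a}, \mathbf{b}, \ldots, \mathbf{d}$ is called orthogonal if all the vectors are mutually pairwise orthogonal and none of them is equal to the zero vector. Such a system is always linearly independent. In fact, suppose that one of the vectors, say $\mathbf{d}$, is a linear combination of the others. If we take the scalar product of an equality of the form $\mathbf{d} = \alpha\mathbf{a} + \beta\mathbf{b} + \ldots$ by $\mathbf{a}$ we obtain $0 = \alpha(\mathbf{a}, \mathbf{a})$ which implies $\alpha = 0$. Similarly, we conclude that all other coefficients on the right-hand side of the last linear combination are equal to zero, that is $\mathbf{d} = \mathbf{0}$, which is impossible.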

It is most convenient to use an orthogonal basis in a finite-dimensional space, that is a basis which is simultaneously an orthogonal system. In fact, if $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ is such a basis then any vector $\mathbf{x}$ is represented in the form $\mathbf{x} = \alpha_1\mathbf{p}_1 + \alpha_2\mathbf{p}_2 + \ldots + \alpha_n\mathbf{p}_n$, and to find the coefficients of the representation it is sufficient to multiply (in the sense of a scalar product) both sides by the vectors $\mathbf{p}_k$ ($k = 1, 2, \ldots, n$). By the orthogonality, this yields $(\mathbf{x}, \mathbf{p}_k) = \alpha_k(\mathbf{p}_k, \mathbf{p}_k)$ which implies

$$\alpha_k = \frac{(\mathbf{x}, \mathbf{p}_k)}{(\mathbf{p}_k, \mathbf{p}_k)} \quad (k = 1, 2, \ldots, n) \qquad (29)$$

If we had a non-orthogonal basis the above method would give a system of $n$ equations of the first degree in $n$ unknowns $\alpha_k$. Formula (29) becomes especially simple when the vectors which form an orthogonal basis are normalized (see Sec. 20). Indeed, then the right-hand sides of (29) have denominators equal to unity. An orthogonal basis of normalized vectors is called a Euclidean basis (compare with Sec. 9) or an orthonormal basis.
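Formula (29) can be illustrated by a short computation in $E_4$. The following Python sketch is not part of the original text; the orthogonal basis and the vector $\mathbf{x}$ are arbitrary sample data, and `coefficients` is a hypothetical helper name.

```python
from fractions import Fraction

def dot(x, y):
    """Scalar product (24)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def coefficients(x, basis):
    """Coefficients alpha_k = (x, p_k) / (p_k, p_k); valid only for an orthogonal basis."""
    return [Fraction(dot(x, p), dot(p, p)) for p in basis]

# an orthogonal (not normalized) basis of E_4, chosen for illustration
basis = [(1, 1, 0, 0), (1, -1, 0, 0), (0, 0, 1, 1), (0, 0, 1, -1)]
x = (3, 1, 4, -2)

alpha = coefficients(x, basis)

# rebuild x from its coefficients to confirm the representation
rebuilt = tuple(sum(a * p[k] for a, p in zip(alpha, basis)) for k in range(4))
print(alpha, rebuilt)
```

No system of equations has to be solved: each coefficient is found independently of the others, which is the practical advantage of an orthogonal basis.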

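An orthogonal basis can be constructed in any finite-dimensional Euclidean space. Actually, let, for example, $(R)$ be four-dimensional and let $\mathbf{q}_1$, $\mathbf{q}_2$, $\mathbf{q}_3$ and $\mathbf{q}_4$ be a basis in $(R)$. We put $\mathbf{p}_1 = \mathbf{q}_1$ and $\mathbf{p}_2 = \mathbf{q}_2 + \alpha\mathbf{p}_1$ and try to choose $\alpha$ so that $(\mathbf{p}_2, \mathbf{p}_1)$ should be equal to zero: $(\mathbf{p}_2, \mathbf{p}_1) = 0$. This yields $(\mathbf{q}_2 + \alpha\mathbf{p}_1, \mathbf{p}_1) = 0$, i.e. $\alpha = -(\mathbf{q}_2, \mathbf{p}_1)(\mathbf{p}_1, \mathbf{p}_1)^{-1}$. Then we put $\mathbf{p}_3 = \mathbf{q}_3 + \beta_1\mathbf{p}_1 + \beta_2\mathbf{p}_2$ and choose $\beta_1$ and $\beta_2$ in such a way that $(\mathbf{p}_3, \mathbf{p}_1) = 0$ and $(\mathbf{p}_3, \mathbf{p}_2) = 0$. This implies (check it!) $\beta_1 = -(\mathbf{q}_3, \mathbf{p}_1)(\mathbf{p}_1, \mathbf{p}_1)^{-1}$ and $\beta_2 = -(\mathbf{q}_3, \mathbf{p}_2)(\mathbf{p}_2, \mathbf{p}_2)^{-1}$. Finally, putting $\mathbf{p}_4 = \mathbf{q}_4 + \gamma_1\mathbf{p}_1 + \gamma_2\mathbf{p}_2 + \gamma_3\mathbf{p}_3$ we determine the coefficients $\gamma_k$ ($k = 1, 2, 3$) in such a way that $(\mathbf{p}_4, \mathbf{p}_1) = 0$, $(\mathbf{p}_4, \mathbf{p}_2) = 0$ and $(\mathbf{p}_4, \mathbf{p}_3) = 0$. We leave the determination of $\gamma_k$ ($k = 1, 2, 3$) to the reader. It is easy to show that by the linear independence of the vectors $\mathbf{q}_k$ ($k = 1, 2, 3, 4$) all the denominators entering in this procedure are different from zero and this orthogonalization process can therefore be completed. The vectors $\mathbf{p}_1$, $\mathbf{p}_2$, $\mathbf{p}_3$ and $\mathbf{p}_4$ thus obtained form an orthogonal basis in $(R)$.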
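The orthogonalization process just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the space is taken to be $E_4$ with the scalar product (24), the starting basis is an arbitrary sample, exact rational arithmetic is used, and `orthogonalize` is a hypothetical helper name.

```python
from fractions import Fraction

def dot(x, y):
    """Scalar product (24)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def orthogonalize(q):
    """Build p_1, p_2, ... with p_k = q_k + sum over j < k of c_j p_j,
    where c_j = -(q_k, p_j) / (p_j, p_j), as in the procedure of the text."""
    p = []
    for qk in q:
        pk = [Fraction(c) for c in qk]
        for pj in p:
            coeff = -dot(qk, pj) / dot(pj, pj)   # zero denominator only if q is dependent
            pk = [a + coeff * b for a, b in zip(pk, pj)]
        p.append(tuple(pk))
    return p

# a linearly independent (non-orthogonal) sample basis of E_4
q = [(1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1), (0, 1, 1, 1)]
p = orthogonalize(q)
for vec in p:
    print([str(c) for c in vec])
```

If the vectors `q` were linearly dependent, one of the vectors `p` would turn out to be zero and the next division would fail, in agreement with the remark about the denominators made above.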

It follows that all the Euclidean spaces of the same dimension are isomorphic to each other. Let us discuss here this important corollary. Two given Euclidean spaces are called isomorphic if it is possible to establish a linear isomorphism (see Sec. 19) between them which preserves the scalar product, that is $(\mathbf{x}, \mathbf{y})_R = (\mathbf{x}', \mathbf{y}')_{R'}$ (the subscripts $R$ and $R'$ indicate here the spaces in which the scalar products are taken). Two isomorphic Euclidean spaces are indistinguishable from the point of view of the theory of Euclidean spaces. It is obvious that two isomorphic Euclidean spaces are of the same dimension. But it is easy to show that, conversely, two Euclidean spaces of the same dimension are isomorphic. To prove this we choose an orthonormal basis in each of the spaces (by the way, if we normalize orthogonal vectors their orthogonality is preserved). Let these bases be $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ and $\mathbf{p}'_1, \mathbf{p}'_2, \ldots, \mathbf{p}'_n$, respectively. Now we establish an isomorphism by using the procedure described in Sec. 19:

$$\alpha_1\mathbf{p}_1 + \alpha_2\mathbf{p}_2 + \ldots + \alpha_n\mathbf{p}_n \leftrightarrow \alpha_1\mathbf{p}'_1 + \alpha_2\mathbf{p}'_2 + \ldots + \alpha_n\mathbf{p}'_n$$

Then if $\mathbf{x} = \alpha_1\mathbf{p}_1 + \ldots + \alpha_n\mathbf{p}_n$ and $\mathbf{y} = \beta_1\mathbf{p}_1 + \ldots + \beta_n\mathbf{p}_n$ we have

$$(\mathbf{x}, \mathbf{y})_R = \alpha_1\beta_1 + \ldots + \alpha_n\beta_n = (\mathbf{x}', \mathbf{y}')_{R'}$$


(why is it so?). In particular, we conclude that every finite-dimensional Euclidean space of dimension $n$ is isomorphic to the vector space $E_n$. The same is also true for complex spaces.

§ 7. Vector Functions of Scalar Argument. Curvature

22. Vector Variables. Let us return to usual geometrical vectors. Vector variables are closely related to scalar variables. Indeed, if, for example, we introduce a Cartesian coordinate system and write

$$\mathbf{u} = u_x\mathbf{i} + u_y\mathbf{j} + u_z\mathbf{k} \qquad (30)$$

then it is seen that every change of the vector quantity $\mathbf{u}$ reduces to certain changes of the scalar quantities $u_x$, $u_y$ and $u_z$. We shall give only a few simple remarks.

A vector quantity is infinitesimal (we denote this as $\mathbf{u} \to \mathbf{0}$) if its absolute value $|\mathbf{u}|$ is an infinitesimal variable, i.e. $|\mathbf{u}| \to 0$. But the direction of the vector $\mathbf{u}$ may change in an arbitrary way in such a process and may not have a limit.

The vectorial limiting relationship $\mathbf{u} \to \mathbf{a}$ (where $\mathbf{a}$ is a constant vector) is equivalent to the three scalar relationships $u_x \to a_x$, $u_y \to a_y$ and $u_z \to a_z$. In addition, if $\mathbf{a} \ne \mathbf{0}$ then $\mathbf{u}$ tends to $\mathbf{a}$ not only in its absolute value but also in its direction. The theorems on the limits of a sum, of a product and the like (see Sec. III.5) remain true here and their proofs hold too. But, of course, the properties of limits which are connected with inequalities do not apply to vectors since, as it was agreed before, we do not consider the notion of an inequality for vectors in our course.

23. Vector Functions of Scalar Argument. We say that there is a vector function of a scalar argument $\mathbf{u} = \mathbf{f}(t)$ if to each value of the scalar variable $t$ there corresponds, by a certain law, a value of the vector quantity $\mathbf{u}$.

According to the beginning of Sec. 22, the determination of one vector function is equivalent to the determination of three scalar functions (understood in the ordinary sense) because $u_x = f_x(t)$, $u_y = f_y(t)$ and $u_z = f_z(t)$.

The concept of continuity of a vector function of a scalar argument is introduced in the same way as for scalar functions. Further,

$$\frac{d\mathbf{u}}{dt} = \mathbf{f}'(t) = \lim_{\Delta t \to 0}\frac{\Delta\mathbf{u}}{\Delta t} = \lim_{\Delta t \to 0}\frac{\mathbf{f}(t + \Delta t) - \mathbf{f}(t)}{\Delta t}$$

All the basic properties of a derivative (see Sec. IV.4) are transferred (together with their proofs) to the case of vector functions. But it is necessary to stress that while applying these rules to differentiating a vector product we must pay attention to the order of factors: $(\mathbf{u} \times \mathbf{v})' = \mathbf{u}' \times \mathbf{v} + \mathbf{u} \times \mathbf{v}'$. The concept of Taylor’s series is also extended to vector functions (see Eq. IV.52).

The following property is sometimes of use: if $|\mathbf{u}(t)| = \text{const}$ then $\mathbf{u}' \perp \mathbf{u}$. Indeed, the condition can be written in the form $\mathbf{u}\cdot\mathbf{u} = u^2 = \text{const}$. Now differentiating we obtain $\mathbf{u}'\cdot\mathbf{u} + \mathbf{u}\cdot\mathbf{u}' = 0$ and $2\mathbf{u}'\cdot\mathbf{u} = 0$ which implies $\mathbf{u}' \perp \mathbf{u}$.

To illustrate the geometrical meaning of a vector function of a scalar argument let us draw the vector $\mathbf{u}$ from a fixed point $O$ in space. Then it is natural to regard $\mathbf{u}$ as a radius-vector (see Sec. 9) and denote it by $\mathbf{r}$, i.e.

$$\mathbf{r} = \mathbf{f}(t) \qquad (31)$$

As $t$ varies the terminus of the vector $\mathbf{r}$ describes a curve $(L)$ in space (see Fig. 172). We can therefore regard (31) as a vector-parametric equation of the curve $(L)$. But if we introduce Cartesian axes it is easy to pass from (31) to scalar-parametric equations of the form

$$x = \varphi(t), \quad y = \psi(t), \quad z = \chi(t) \qquad (32)$$

which determine the same curve. The right-hand sides of (32) are the result of projecting the function $\mathbf{f}(t)$ on the coordinate axes. [Compare this with parametric equations (II.10) of a plane curve.] If a space curve is originally defined by equations of form (32) then we pass to form (31) according to the formula $\mathbf{r} = \varphi(t)\mathbf{i} + \psi(t)\mathbf{j} + \chi(t)\mathbf{k}$. It is convenient to interpret the argument $t$ as time. Then the curve $(L)$ may be regarded as a trajectory of a moving point. This curve is also called the hodograph of the vector function $\mathbf{u}$.

Fig. 172    Fig. 173

For example, as it is shown in Fig. 173, the equation $\mathbf{r} = \mathbf{a} + \mathbf{b}t$ where $\mathbf{a}$ and $\mathbf{b}$ are some constant vectors determines a straight line and describes a uniform rectilinear motion along this line with the velocity $\mathbf{b}$. Projecting this equation on the coordinate axes we arrive at the parametric equations of the straight line:

$$x = a_x + b_xt, \quad y = a_y + b_yt, \quad z = a_z + b_zt \qquad (33)$$


As another example, let us take the equation of a screw line (circular helix). This curve can be obtained as a superposition of two motions of a point: the uniform motion along an axis and the uniform rotation about the same axis (see Fig. 174). We choose the $z$-axis as the axis of revolution and denote the speed of the rectilinear motion by $v$ and the angular speed by $\omega$. Then we have

$$x = R\cos\omega t, \quad y = R\sin\omega t, \quad z = vt$$

or, in the form of a vector equation,

$$\mathbf{r} = R(\cos\omega t\,\mathbf{i} + \sin\omega t\,\mathbf{j}) + vt\,\mathbf{k}$$
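The helix is a convenient test case for the rules of Sec. 23. In the Python sketch below (an illustration only; the numerical values of $R$, $\omega$, $v$ and the function names are arbitrary assumptions) the derivative of $\mathbf{r}(t)$ is computed both analytically and by a central difference, and the property "$|\mathbf{u}| = \text{const}$ implies $\mathbf{u}' \perp \mathbf{u}$" is checked for $\mathbf{u} = d\mathbf{r}/dt$, whose absolute value $\sqrt{R^2\omega^2 + v^2}$ does not depend on $t$.

```python
import math

R, omega, v = 2.0, 3.0, 0.5     # sample values of the radius and the two speeds

def r(t):
    """Radius-vector of the helix."""
    return (R * math.cos(omega * t), R * math.sin(omega * t), v * t)

def r1(t):
    """First derivative dr/dt, differentiated coordinate by coordinate."""
    return (-R * omega * math.sin(omega * t), R * omega * math.cos(omega * t), v)

def r2(t):
    """Second derivative d2r/dt2."""
    return (-R * omega**2 * math.cos(omega * t), -R * omega**2 * math.sin(omega * t), 0.0)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

t0, h = 0.7, 1e-5
numeric = tuple((p - m) / (2 * h) for p, m in zip(r(t0 + h), r(t0 - h)))

speed = math.sqrt(dot(r1(t0), r1(t0)))
print(numeric, r1(t0))
print(speed, math.sqrt(R**2 * omega**2 + v**2))
print(dot(r1(t0), r2(t0)))      # (numerically) zero: dr/dt is orthogonal to its derivative
```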
If the variable $t$ receives an increment $\Delta t$ the point $M$ on the curve $(L)$ passes to the position $N$ (see Fig. 172). Hence, $\Delta\mathbf{r} = \overrightarrow{MN}$. The ratio $\dfrac{\Delta\mathbf{r}}{\Delta t}$ (which is a vector representing the average velocity) also lies on the straight line $MN$ because $\Delta t$ is a scalar. When $\Delta t \to 0$ we have $\dfrac{\Delta\mathbf{r}}{\Delta t} \to \dfrac{d\mathbf{r}}{dt} = \mathbf{v}$ (where $\mathbf{v}$ is the instantaneous velocity). Let us prolong the secant $MN$ and watch the change of the position of this straight line as $\Delta t \to 0$, i.e. as $N \to M$. The secant rotates about the point $M$ in this process and tends to the position of the tangent to the trajectory at the point $M$. Hence we see that the instantaneous velocity vector $\mathbf{v} = \dfrac{d\mathbf{r}}{dt}$ is directed along the tangent line to the trajectory, this fact being well known in mechanics. The vector-parametric equation of a tangent line drawn for a certain value $t = t_0$ has the form (see Fig. 173)

$$\mathbf{r} = \mathbf{r}_0 + \mathbf{v}_0(t - t_0)$$

Fig. 174


where $\mathbf{r}_0 = \mathbf{f}(t_0)$ and $\mathbf{v}_0 = \mathbf{f}'(t_0)$. We can rewrite this equation as follows: $\Delta\mathbf{r} = d\mathbf{f}$. But if we take the equation $\mathbf{r} = \mathbf{f}(t)$ of the curve $(L)$ then $\Delta\mathbf{r} = \Delta\mathbf{f}$. Thus we see that the replacement of $\Delta\mathbf{f}$ by $d\mathbf{f}$ is equivalent to the replacement of the motion along the curve $(L)$ by the uniform motion along the tangent line with the velocity equal to the instantaneous velocity at the given moment of time, that is by a motion which would appear if all the forces stopped acting on the moving point at this moment.
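According to Sec. III.9 (see example 1) we have, as $\Delta s \to 0$,

$$\left|\frac{\Delta\mathbf{r}}{\Delta s}\right| \to 1, \quad \text{i.e.} \quad \left|\frac{d\mathbf{r}}{ds}\right| = \lim\left|\frac{\Delta\mathbf{r}}{\Delta s}\right| = 1, \quad |d\mathbf{r}| = |ds| \qquad (34)$$

Here we take the absolute values of $\Delta s$ and $ds$ since these quantities can be both positive and negative. Let us choose now the arc length $s$ as a parameter varying along the curve $(L)$ and reckon it from a point $M_0$ of the curve, that is take the equation of the curve in the form $\mathbf{r} = \mathbf{r}(s)$. Then we see that the derivative $\dfrac{d\mathbf{r}}{ds}$ is a unit vector in the direction of the tangent to $(L)$, that is the unit tangent vector (see Sec. 7). This vector is usually denoted by the letter $\boldsymbol{\tau}$, i.e. $\dfrac{d\mathbf{r}}{ds} = \boldsymbol{\tau}$. Consequently, we have

$$\frac{d\mathbf{r}}{dt} = \frac{d\mathbf{r}}{ds}\frac{ds}{dt} = \frac{ds}{dt}\boldsymbol{\tau} = v\boldsymbol{\tau} \quad \text{or} \quad \boldsymbol{\tau} = \frac{d\mathbf{r}/dt}{ds/dt} \qquad (35)$$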
dt 


We remark in conclusion that formula (34) implies the following expression for the differential of the arc length in Cartesian coordinates:

$$ds = \pm|d\mathbf{r}| = \pm|d(x\mathbf{i} + y\mathbf{j} + z\mathbf{k})| = \pm|dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}| = \pm\sqrt{dx^2 + dy^2 + dz^2}$$

24. Some Notions Related to the Second Derivative. Since $|\boldsymbol{\tau}(s)| = 1 = \text{const}$ we have $\dfrac{d\boldsymbol{\tau}}{ds} \perp \boldsymbol{\tau}$ (see the beginning of Sec. 23). The straight line $pp$ (see Fig. 175) drawn through a moving point $M$ of the trajectory $(L)$ and parallel to $\dfrac{d\boldsymbol{\tau}}{ds}$ is therefore a normal to $(L)$ (a perpendicular to the tangent $ll$ at the point $M$). There exists an infinitude of normals drawn to a curve in space at each of its points. At each point these normals form a plane which is called the normal plane. To distinguish the normal in the direction of the vector $\dfrac{d\boldsymbol{\tau}}{ds}$ from other normals at a given point $M$ we call this normal the principal normal of the curve $(L)$ at the point $M$. The length (the absolute value) of the vector $\dfrac{d\boldsymbol{\tau}}{ds}$ is called the curvature of the curve $(L)$ at the point $M$ and is denoted by the letter $k$, i.e. $\left|\dfrac{d\boldsymbol{\tau}}{ds}\right| = k$ and $\dfrac{d\boldsymbol{\tau}}{ds} = k\mathbf{n}$ where $\mathbf{n}$ is the unit vector in the direction of the principal normal.

Fig. 175    Fig. 176 ($|\boldsymbol{\tau}| = |\boldsymbol{\tau} + \Delta\boldsymbol{\tau}| = 1$)

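The geometrical significance of the curvature is shown in Fig. 176:

$$k = \left|\frac{d\boldsymbol{\tau}}{ds}\right| = \lim\left|\frac{\Delta\boldsymbol{\tau}}{\Delta s}\right| = \lim\frac{BC}{\Delta s} = \lim\frac{2\sin\frac{\alpha}{2}}{\Delta s} = \lim\frac{\alpha}{\Delta s}$$

(here $BC = |\Delta\boldsymbol{\tau}|$ and $\alpha$ is the angle between the unit vectors $\boldsymbol{\tau}$ and $\boldsymbol{\tau} + \Delta\boldsymbol{\tau}$; in the last passage to the limit we have used property 4 from Sec. III.8).

Thus, the curvature is the speed (the angular speed related to the unit of the distance passed over) with which the tangent to the curve rotates. Incidentally, we see that the vectors $\dfrac{d\boldsymbol{\tau}}{ds}$ and $\mathbf{n}$ go in the direction of the concavity of the curve.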

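Differentiating the first formula (35) with respect to $t$ we obtain

$$\frac{d^2\mathbf{r}}{dt^2} = \frac{dv}{dt}\boldsymbol{\tau} + v\frac{d\boldsymbol{\tau}}{dt} = \frac{dv}{dt}\boldsymbol{\tau} + v\frac{d\boldsymbol{\tau}}{ds}\frac{ds}{dt} = \frac{dv}{dt}\boldsymbol{\tau} + v^2k\mathbf{n}$$

This formula is widely applied in mechanics since if $t$ is the time of motion the formula shows that the acceleration vector $\dfrac{d^2\mathbf{r}}{dt^2}$ can be resolved into the components $\dfrac{dv}{dt}\boldsymbol{\tau}$ and $v^2k\mathbf{n}$. $\dfrac{dv}{dt}\boldsymbol{\tau}$ is the tangential component since it has the direction of the tangent and $v^2k\mathbf{n}$ is the normal component (directed along the principal normal).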
Thus, the vector $\dfrac{d^2\mathbf{r}}{dt^2}$ drawn from a point $M$ must necessarily lie in the plane passing through the tangent and the principal normal drawn from this point. This plane is called the osculating plane of the curve $(L)$ at the point $M$. Applying Taylor’s formula (IV.50) to $\mathbf{f}(t_0 + \Delta t)$ we conclude that the curve $(L)$ may be regarded in the vicinity of its point as lying in its osculating plane with an accuracy up to infinitesimals of the second order (relative to $\Delta t$) inclusive. (It can similarly be shown that by formula (IV.49) the curve $(L)$ coincides with its tangent to within infinitesimals of the second order relative to $\Delta t$ as $\Delta t \to 0$.)

Hence, the osculating plane of the curve $(L)$ at the point $M$ may be regarded as a plane passing through three points of the curve $(L)$ lying infinitely close to the point $M$ (just as we regard the tangent line as a line passing through two points of a curve which are infinitely close to each other).

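25. Osculating Circle. Let a curve $(L)$ in a plane be represented parametrically (see Sec. II.6): $x = x(t)$ and $y = y(t)$. Then, as it is seen in Fig. 177, the curvature $k$ of the curve at any moving point is equal to $\left|\dfrac{d\varphi}{ds}\right| = \left|\dfrac{d\arctan(\dot{y}/\dot{x})}{ds}\right|$ where the dot denotes the differentiation with respect to the parameter. Taking the expression of $ds$ obtained in the end of Sec. 23 and performing some transformations we obtain

$$k = \left|\frac{\dfrac{1}{1 + (\dot{y}/\dot{x})^2}\cdot\dfrac{\ddot{y}\dot{x} - \dot{y}\ddot{x}}{\dot{x}^2}\,dt}{\sqrt{\dot{x}^2 + \dot{y}^2}\,dt}\right| = \frac{|\ddot{y}\dot{x} - \dot{y}\ddot{x}|}{(\dot{x}^2 + \dot{y}^2)^{3/2}} \qquad (36)$$

Fig. 177 ($\tan\varphi = dy/dx$, see Sec. IV.9)    Fig. 178

In particular, if the equation of a curve is represented in the form $y = f(x)$, that is the argument $x$ itself is regarded as a parameter, then $\dot{y} = y'$, $\ddot{y} = y''$, $\dot{x} = 1$, $\ddot{x} = 0$ and

$$k = \frac{|y''|}{(1 + y'^2)^{3/2}} \qquad (37)$$

Fig. 178 shows that for a circle

$$k = \left|\frac{d\varphi}{ds}\right| = \lim\left|\frac{\Delta\varphi}{\Delta s}\right| = \lim\frac{\Delta\varphi}{R\,\Delta\varphi} = \frac{1}{R} \qquad (38)$$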


which means that the curvature of a circle is constant and inverse to its radius. The only plane curves with constant curvature are circles and straight lines; the curvature of a straight line is equal to zero.

Let us take an arbitrary point $M$ on a curve $(L)$. The circle $(K)$ passing through $M$ and having the same tangent, the same curvature and the same direction of convexity as $(L)$ is called the osculating circle (the “circle of curvature”) of the curve $(L)$ at the point $M$ (see Fig. 177). The radius and the centre of this circle are called the radius of curvature and the centre of curvature, respectively. According to formulas (36)-(38) we have

$$R = \frac{1}{k_{\text{osculating circle}}} = \frac{1}{k_{\text{curve }(L)}} = \left|\frac{ds}{d\varphi}\right| = \frac{(\dot{x}^2 + \dot{y}^2)^{3/2}}{|\ddot{y}\dot{x} - \dot{y}\ddot{x}|} \qquad (39)$$

A curve $(L)$ is very close to its osculating circle near its point $M$. On the basis of Taylor’s formula (IV.50) we can show that the curve $(L)$ can be regarded as coinciding with its osculating circle in the vicinity of the point $M$ with an infinitesimal error of the third order relative to $\Delta t$. This assertion, in its turn, implies that the osculating circle may be regarded as a circle passing through three points of the curve $(L)$ lying infinitely close to each other. The last property also applies to curves in space. In particular, it straightway implies that the osculating circle lies in the osculating plane.

Fig. 179
The points of a curve at which the curvature assumes its extremal values (but not the points of inflection) are called the vertices of the curve. For instance, take the parametric equations $x = a\cos t$ and $y = b\sin t$ of an ellipse. Then

$$k = \frac{|(-b\sin t)(-a\sin t) - (b\cos t)(-a\cos t)|}{[(-a\sin t)^2 + (b\cos t)^2]^{3/2}} = \frac{ab}{(a^2\sin^2 t + b^2\cos^2 t)^{3/2}}$$

Here the denominator has extremal values at $t = 0$, $t = \dfrac{\pi}{2}$, $t = \pi$, $t = \dfrac{3\pi}{2}$ etc. (check it up!). Therefore the vertices of an ellipse understood in the new sense coincide with the vertices defined in Sec. II.10. The radius of curvature is equal to $\dfrac{b^2}{a}$ at the points of intersection of the ellipse with the $x$-axis and is equal to $\dfrac{a^2}{b}$ at the points of intersection with the $y$-axis. This result is applied to constructing an approximate form of an ellipse: we draw the circles of curvature at the vertices by means of a compass (see Fig. 179) and then connect the circles by the lines drawn with the help of a French curve. For the sake of simplicity only a quarter of the ellipse is shown. The construction of the radii of curvature at the vertices is also demonstrated.
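The values of the radius of curvature at the vertices are easily confirmed by a direct computation with formula (36). The following Python sketch is an illustration only; the semi-axes $a = 5$, $b = 3$ are arbitrary sample values and the function names are hypothetical.

```python
import math

def curvature(xd, yd, xdd, ydd):
    """Formula (36): curvature from the first and second derivatives
    of x(t) and y(t) with respect to the parameter."""
    return abs(ydd * xd - yd * xdd) / (xd**2 + yd**2) ** 1.5

a, b = 5.0, 3.0     # semi-axes of the ellipse x = a cos t, y = b sin t

def ellipse_curvature(t):
    xd, yd = -a * math.sin(t), b * math.cos(t)
    xdd, ydd = -a * math.cos(t), -b * math.sin(t)
    return curvature(xd, yd, xdd, ydd)

R_on_x_axis = 1 / ellipse_curvature(0.0)           # expected b**2 / a
R_on_y_axis = 1 / ellipse_curvature(math.pi / 2)   # expected a**2 / b
print(R_on_x_axis, b**2 / a)
print(R_on_y_axis, a**2 / b)

# for a circle of radius rho the same formula gives k = 1 / rho, as in (38)
rho, t = 2.5, 1.1
k_circle = curvature(-rho * math.sin(t), rho * math.cos(t),
                     -rho * math.cos(t), -rho * math.sin(t))
print(k_circle, 1 / rho)
```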

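To determine the coordinates $\xi$ and $\eta$ of the centre of curvature of a curve $(L)$ at a point $M$ let us suppose that the curve is convex downwards at $M$, that is $y'' > 0$ (see Sec. IV.20), as it is shown in Fig. 177. Then

$$\xi = x - R\sin\varphi = x - \frac{(1 + y'^2)^{3/2}}{y''}\cdot\frac{y'}{\sqrt{1 + y'^2}} = x - \frac{y'(1 + y'^2)}{y''},$$
$$\eta = y + R\cos\varphi = y + \frac{(1 + y'^2)^{3/2}}{y''}\cdot\frac{1}{\sqrt{1 + y'^2}} = y + \frac{1 + y'^2}{y''} \qquad (40)$$

In case $y'' < 0$ we obtain the same formulas. If we pass to the derivatives with respect to an arbitrary parameter we obtain (check it!)

$$\xi = x - \frac{\dot{y}(\dot{x}^2 + \dot{y}^2)}{\ddot{y}\dot{x} - \dot{y}\ddot{x}}, \quad \eta = y + \frac{\dot{x}(\dot{x}^2 + \dot{y}^2)}{\ddot{y}\dot{x} - \dot{y}\ddot{x}} \qquad (41)$$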

26. Evolute and Evolvent. A curve $(L)$ given, consider the locus of centres of curvature of the curve $(L)$ and denote it by $(\bar{L})$. This locus is called the evolute of the curve. In its turn, the curve $(L)$ itself is called the evolvent (or involute) with respect to its evolute $(\bar{L})$.

Thus, if $(\bar{L})$ is the evolute of $(L)$ then $(L)$ is the evolvent of $(\bar{L})$ and vice versa. It can be shown that we have the following relationships between an evolute and its evolvent.

(1) Let $M$ be an arbitrary point of the evolvent and $\bar{M}$ the corresponding point of the evolute, i.e. $\bar{M}$ is the centre of curvature of the curve $(L)$ at the point $M$. Then the straight line $M\bar{M}$ is not only the normal to the evolvent but also the tangent to the evolute. This relationship is illustrated in Fig. 180.

(2) Let a point $M$ move along the evolvent. Then the increment of the radius of curvature is equal to the length of the corresponding arc of the evolute lying between the centres of curvature (see Fig. 181).
These properties imply the following characteristic feature of
an evolvent: if an unstretchable taut thread is wound off a contour
having the form of the evolute, the end of the thread describes the
evolvent. The involute (evolvent) of a circle is of practical
importance because the profiles of the lateral surfaces of the teeth of
most gear wheels are shaped in the form of the evolvent of a circle.

Fig. 180    Fig. 181
($(\bar L)$ is the evolute, (L) is the evolvent)

If a curve (L) is represented in the parametric form $x = x(t)$
and $y = y(t)$ then the parametric equations of its evolute
$\xi = \xi(t)$ and $\eta = \eta(t)$ (where $\xi$ and $\eta$ are the coordinates along the
same x- and y-axes) are obtained from formulas (41) by substituting
the expressions of x and y in terms of t into the formulas.

To prove the two properties let us first differentiate equalities
(40) (which hold if $y'' > 0$):

$d\xi = dx - dR\,\sin\varphi - R\cos\varphi\,d\varphi,$
$d\eta = dy + dR\,\cos\varphi - R\sin\varphi\,d\varphi$   (42)

But formula (39) implies

$dx = ds\,\cos\varphi = \frac{ds}{d\varphi}\cos\varphi\,d\varphi = R\cos\varphi\,d\varphi$   (43)

and, similarly, $dy = R\sin\varphi\,d\varphi$.
Substituting (43) into (42) we obtain

$d\xi = -dR\,\sin\varphi, \quad d\eta = dR\,\cos\varphi$   (44)

The same final result (44) would be obtained if we considered the
case $y'' < 0$.


The immediate consequence of formulas (44) is that
$\frac{d\eta}{d\xi} = -\cot\varphi = -\frac{1}{dy/dx}$. This means (see problem 5 in Sec. II.9) that
the tangent to the evolvent at its arbitrary point M and the tangent
to the evolute at the corresponding point $\bar M$ are mutually
perpendicular (see Fig. 180), which is just the first property.

From formulas (44) we also deduce $d\xi^2 + d\eta^2 = dR^2$. This implies
(see the end of Sec. 23) $dR = \pm ds$ where s is the arc length
reckoned along the evolute. Consequently, $d(R \mp s) = 0$,
$R \mp s = \text{const}$, $R = C \pm s$, $\Delta R = \pm\Delta s$ and $|\Delta R| = |\Delta s|$.
This furnishes the proof of the second property of
the evolute and the evolvent.

It may be shown that to the verti- 
ces of a curve (see the end of Sec. 25) 
there correspond cusps of its evolute (see Fig. 182). For example, 
the evolute of an ellipse has four cusps. If a curve has a zero 
curvature at a point (in particular, such a situation usually occurs 
at points of inflection) the corresponding point of its evolute travels 
into infinity. 

Fig. 182

As an example, let us determine the evolute of a cycloid [see
formulas (II.12) in which we substitute t for $\varphi$]. Here we have

$x = R(t - \sin t), \quad \dot x = R(1 - \cos t), \quad \ddot x = R\sin t;$
$y = R(1 - \cos t), \quad \dot y = R\sin t, \quad \ddot y = R\cos t;$
$\dot x^2 + \dot y^2 = R^2(1 - \cos t)^2 + R^2\sin^2 t = 2R^2(1 - \cos t);$
$\dot x\ddot y - \ddot x\dot y = R\cos t\cdot R(1 - \cos t) - R\sin t\cdot R\sin t = -R^2(1 - \cos t)$

By formulas (41) we obtain

$\xi = R(t - \sin t) - \frac{R\sin t\cdot 2R^2(1 - \cos t)}{-R^2(1 - \cos t)} = R(t + \sin t)$

and, similarly, $\eta = -R(1 - \cos t)$. Now denoting $t = \pi + \tau$
we obtain

$\xi = R[\pi + \tau + \sin(\pi + \tau)] = R(\tau - \sin\tau) + \pi R,$
$\eta = -R[1 - \cos(\pi + \tau)] = R(1 - \cos\tau) - 2R$

Since $\pi R$ and $2R$ are constants we see [comparing with formulas
(II.12)] that the evolute of a cycloid is a cycloid of the same size
but translated $\pi R$ units of length in the positive direction of the
x-axis and $2R$ units of length in the negative direction of the y-axis
with respect to the original cycloid. This fact is sometimes utilized
in engineering.
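
The computation above can be checked numerically. The following Python sketch is only an illustration; the helper names `centre_of_curvature` and `cycloid_evolute` are ours and do not come from the text. It evaluates formulas (41) for the cycloid and compares the result with the shifted cycloid obtained above.

```python
import math

def centre_of_curvature(x, y, dx, dy, ddx, ddy):
    """Centre of curvature by formulas (41); dx, ddx etc. are derivatives in t."""
    denom = dx * ddy - ddx * dy
    speed2 = dx * dx + dy * dy
    return x - dy * speed2 / denom, y + dx * speed2 / denom

def cycloid_evolute(R, t):
    """Point of the evolute of the cycloid x = R(t - sin t), y = R(1 - cos t)."""
    x, y = R * (t - math.sin(t)), R * (1 - math.cos(t))
    dx, dy = R * (1 - math.cos(t)), R * math.sin(t)
    ddx, ddy = R * math.sin(t), R * math.cos(t)
    return centre_of_curvature(x, y, dx, dy, ddx, ddy)

R = 2.0
for t in (0.5, 1.0, 2.5):          # t must not be a multiple of 2*pi (cusps)
    xi, eta = cycloid_evolute(R, t)
    tau = t - math.pi              # the substitution t = pi + tau
    # the same cycloid shifted by pi*R along x and by -2R along y
    assert math.isclose(xi, R * (tau - math.sin(tau)) + math.pi * R)
    assert math.isclose(eta, R * (1 - math.cos(tau)) - 2 * R)
```

The values of t are chosen away from the cusps of the cycloid, where the denominator in (41) vanishes.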

It also follows from Fig. 181 that an evolvent is a special case
of a roulette (see Sec. II.6) described by a point of a straight line
when the line rolls upon the evolute. Incidentally, this result shows 
that the arc of any curve can serve as a roulette, that is roulettes 
do not form a special class of curves. 


CHAPTER VIII 


Complex Numbers 
and Functions 


Complex numbers are widely used in modern mathematics and 
its applications. It turns out that it is convenient to obtain many 
relationships between real quantities by using complex numbers 
and functions in intermediate calculations. 


§ 1. Complex Numbers 


1. Complex Plane. The definition of a complex number is well 
known from elementary mathematical courses. A complex number 
is an expression of the form 


$z = x + iy$   (1)


where x and y are real numbers and i is the imaginary unit
satisfying the equality $i^2 = -1$; x is called the real part and y the
imaginary part of the complex number z. For the real and imaginary
parts we shall use the notation x = Re z and y = Im z (the term
“imaginary part” is sometimes applied to the whole product iy
which is more natural but less convenient). Two complex numbers
are equal if and only if their real parts are equal and their imaginary
parts are equal: if $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$ then the
equalities

$z_1 = z_2$ and $x_1 = x_2, \ y_1 = y_2$   (2)

are equivalent. Hence, one “complex equality” is equivalent to two
real equalities. Signs of inequality cannot be applied to complex
numbers, that is inequalities of the form $z_1 > z_2$ do not exist.

Complex numbers may be represented in a plane. To do this we
take a Cartesian coordinate system x, y which enables us to
represent any number of the form (1) as a point M (x; y). Such a plane
is conventionally called a complex plane but, of course, all the
points of the plane have real coordinates. For brevity we often say
“the point x + iy” instead of “the point corresponding to the number
x + iy” (see Fig. 183).

Real numbers are a special case of complex numbers: if we put
y = 0 in formula (1) we may regard a complex number x + i·0
as representing the real number x; real numbers are represented as
points lying on the real axis (axis of reals), i.e. on the x-axis.
Complex numbers that are not real are called imaginary. Thus, every
complex number is either real or imaginary. A complex number
without a real part (i.e. with the real part equal to zero) is called
pure imaginary; such numbers are represented by points of the
y-axis which is called the imaginary axis or the axis of imaginaries.
A queer thing in the terminology is that the number z = 0 is a pure
imaginary number but not an imaginary number!

Fig. 183    Fig. 184

It is often convenient to introduce polar coordinates in a complex
plane (see Sec. II.3 and Fig. 183). The polar coordinates $\rho$ and $\varphi$
of the point M (x; y) which represents the complex number
z = x + iy are denoted as $\rho = |z|$ and $\varphi = \arg z$; $\rho$ is called the
modulus or the absolute value of z and $\varphi$ is called the argument, or
amplitude, or phase, of z.

As is known, $\rho = \sqrt{x^2 + y^2}$, $x = \rho\cos\varphi$ and $y = \rho\sin\varphi$. This
implies, by (1), that

$z = \rho(\cos\varphi + i\sin\varphi)$   (3)

Hence, each complex number can be written in the so-called
trigonometric form (3). The modulus of a complex number is a certain
uniquely defined non-negative real number whereas the argument
is defined within an integral multiple of $2\pi$. For instance, $|i| = 1$
and $\operatorname{Arg} i = \frac{\pi}{2} + 2k\pi$ ($k = 0, \pm 1, \pm 2, \dots$). The sign Arg z
denotes the totality of all the possible values of the argument of
a complex number z. Thus, Arg z has infinitely many different
values. There is, however, one and only one value of Arg z, denoted
as arg z, which satisfies the inequality $-180° < \arg z \le 180°$;
arg z is called the principal value of the argument. Besides, any
arbitrary value can be taken as the argument of the number z = 0.
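
The passage between the form (1) and the trigonometric form (3) can be tried out with Python's standard `cmath` module. This is a minimal sketch; the function names `to_polar` and `from_polar` are ours. Note that `cmath.phase` returns the principal value of the argument in radians rather than in degrees.

```python
import cmath
import math

def to_polar(z):
    """Return (modulus, principal argument in radians) of z."""
    return abs(z), cmath.phase(z)

def from_polar(rho, phi):
    """Rebuild z = rho (cos phi + i sin phi), i.e. formula (3)."""
    return rho * complex(math.cos(phi), math.sin(phi))

rho, phi = to_polar(1j)
assert math.isclose(rho, 1.0) and math.isclose(phi, math.pi / 2)

# Any other value of Arg z, phi + 2*k*pi, represents the same number.
z = -3 + 4j
rho, phi = to_polar(z)
for k in (-2, -1, 0, 1, 2):
    assert abs(from_polar(rho, phi + 2 * math.pi * k) - z) < 1e-12
```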

2. Algebraic Operations on Complex Numbers. To carry out the 
addition of complex numbers we should sum their real parts and 
their imaginary parts separately. Comparing this rule with the 
rule of addition of vectors [see formula (VII.10)] we see that the 
addition and subtraction of complex numbers are performed in the 
same way as the operations on vectors (see Fig. 184). In particular, 
it follows that 


$|z_1 + z_2| \le |z_1| + |z_2|$


Similarly, it can be verified that complex numbers are multi- 
plied by real numbers in the same way as vectors. These properties 
make it possible to interpret complex numbers as vectors in the 
complex plane. The connection between the representation of com- 
plex numbers as points of the complex plane and their vector repre- 
sentation is obvious: if the vector is drawn from the origin of the 
coordinate system its terminus is at the corresponding point repre- 
senting the complex number. 

The rule of multiplication of complex numbers is quite different
from that of vectors. If we use the trigonometric form

$z_1 = \rho_1(\cos\varphi_1 + i\sin\varphi_1), \quad z_2 = \rho_2(\cos\varphi_2 + i\sin\varphi_2)$

then the product $z = z_1z_2$ can be written as

$z = \rho(\cos\varphi + i\sin\varphi) = \rho_1(\cos\varphi_1 + i\sin\varphi_1)\,\rho_2(\cos\varphi_2 + i\sin\varphi_2) =$
$= \rho_1\rho_2(\cos\varphi_1\cos\varphi_2 + i\cos\varphi_1\sin\varphi_2 + i\sin\varphi_1\cos\varphi_2 - \sin\varphi_1\sin\varphi_2) =$
$= \rho_1\rho_2[\cos(\varphi_1 + \varphi_2) + i\sin(\varphi_1 + \varphi_2)]$

Therefore,

$\rho = \rho_1\rho_2, \quad \varphi = \varphi_1 + \varphi_2$

i.e.

$|z_1z_2| = |z_1|\,|z_2|, \quad \operatorname{Arg}(z_1z_2) = \operatorname{Arg} z_1 + \operatorname{Arg} z_2$

Hence, when complex numbers are multiplied their moduli are
multiplied and their arguments are added. This implies that for the inverse
operation, the division, we have

$\left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|}, \quad \operatorname{Arg}\frac{z_1}{z_2} = \operatorname{Arg} z_1 - \operatorname{Arg} z_2$

The multiplication of a complex number by i is of special
interest: $|iz| = |z|$ and $\operatorname{Arg} iz = \operatorname{Arg} z + \frac{\pi}{2}$ since $|i| = 1$ and
$\arg i = \frac{\pi}{2}$; thus, the vector iz is obtained by turning the vector z
in the positive direction through a right angle.
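
The rule "moduli are multiplied, arguments are added" can be verified with a short Python sketch. The function name `multiply_polar` is our own choice; the product built from the moduli and arguments is compared with Python's built-in complex multiplication.

```python
import cmath

def multiply_polar(z1, z2):
    """Multiply via the trigonometric form: rho = rho1*rho2, phi = phi1 + phi2."""
    rho = abs(z1) * abs(z2)
    phi = cmath.phase(z1) + cmath.phase(z2)   # one of the values of Arg(z1 z2)
    return cmath.rect(rho, phi)               # rho (cos phi + i sin phi)

z1, z2 = 1 + 2j, -3 + 0.5j
assert abs(multiply_polar(z1, z2) - z1 * z2) < 1e-12

# Multiplication by i turns the vector z through a right angle:
# x + iy becomes -y + ix.
z = 3 + 4j
assert 1j * z == complex(-z.imag, z.real)
```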
The rule of multiplication of complex numbers extends
automatically to an arbitrary number of factors. In particular, if we take
equal factors then we have

$[\rho(\cos\varphi + i\sin\varphi)]^n = \rho^n(\cos n\varphi + i\sin n\varphi)$   ($n = 1, 2, 3, \dots$)

In case $\rho = 1$ we obtain the formula

$(\cos\varphi + i\sin\varphi)^n = \cos n\varphi + i\sin n\varphi$

which was named De Moivre’s formula after the English
mathematician A. de Moivre (1667-1754) who discovered the formula in 1707.
De Moivre’s formula can be applied to express the values of
trigonometric functions of multiple arcs. For example, taking n = 3
we get

$(\cos\varphi + i\sin\varphi)^3 = \cos^3\varphi + i\,3\cos^2\varphi\sin\varphi - 3\cos\varphi\sin^2\varphi - i\sin^3\varphi = \cos 3\varphi + i\sin 3\varphi$

which implies, by formula (2), that

$\cos 3\varphi = \cos^3\varphi - 3\cos\varphi\sin^2\varphi, \quad \sin 3\varphi = 3\cos^2\varphi\sin\varphi - \sin^3\varphi$

Here, of course, we should take into account the table of powers
of the number i: $i^1 = i$, $i^2 = -1$, $i^3 = -i$, $i^4 = 1$, $i^5 = i$,
etc. By the way, note that $\frac{1}{i} = -i$.


Now we turn to extracting roots of complex numbers. If
$z = \rho(\cos\varphi + i\sin\varphi)$ is given and $\sqrt[n]{z} = w = r(\cos\psi + i\sin\psi)$
is to be found, then, by the definition of a root,
$z = w^n = r^n(\cos n\psi + i\sin n\psi)$. Comparing this with the original
expression of z we conclude (see the end of Sec. 1) that

$r^n = \rho, \quad n\psi = \varphi + 2k\pi$   (k is an arbitrary integer)

Since r and $\rho$ are non-negative numbers, we have $r = (\sqrt[n]{\rho})_{\rm ord}$
and $\psi = \frac{\varphi + 2k\pi}{n}$ where $(\sqrt[n]{\rho})_{\rm ord}$ denotes the “ordinary” (i.e. the
arithmetic, real non-negative) root of a non-negative number. Thus,

$w = (\sqrt[n]{\rho})_{\rm ord}\left(\cos\frac{\varphi + 2k\pi}{n} + i\sin\frac{\varphi + 2k\pi}{n}\right)$

Making k assume the values 0, 1, 2, ... we shall get all the
values $w_1, w_2, w_3, \dots$ of the root. But for k = n we obtain

$w_{n+1} = (\sqrt[n]{\rho})_{\rm ord}\left(\cos\frac{\varphi + 2n\pi}{n} + i\sin\frac{\varphi + 2n\pi}{n}\right) =$
$= (\sqrt[n]{\rho})_{\rm ord}\left[\cos\left(\frac{\varphi}{n} + 2\pi\right) + i\sin\left(\frac{\varphi}{n} + 2\pi\right)\right] =$
$= (\sqrt[n]{\rho})_{\rm ord}\left(\cos\frac{\varphi}{n} + i\sin\frac{\varphi}{n}\right) = w_1$

Similarly, $w_{n+2} = w_2$ etc. For negative k we do not obtain any
new values either: k = −1 yields the same result as k = n − 1
etc. Finally,

$\left[\sqrt[n]{\rho(\cos\varphi + i\sin\varphi)}\right]_{k+1} = (\sqrt[n]{\rho})_{\rm ord}\left(\cos\frac{\varphi + 2k\pi}{n} + i\sin\frac{\varphi + 2k\pi}{n}\right)$   ($k = 0, 1, \dots, n-1$)   (4)

We see now that the nth root of a complex number has n
different values; the number z = 0 makes the only exception to this
rule since all the roots of 0 are equal to zero.


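For example, since $2i = 2\left(\cos\frac{\pi}{2} + i\sin\frac{\pi}{2}\right)$ we have

$(\sqrt{2i})_{1,2} = (\sqrt{2})_{\rm ord}\left(\cos\frac{\frac{\pi}{2} + 2k\pi}{2} + i\sin\frac{\frac{\pi}{2} + 2k\pi}{2}\right)$   ($k = 0, 1$)

which enables us to calculate $(\sqrt{2i})_1 = 1 + i$ and $(\sqrt{2i})_2 = -1 - i$
easily.

Consider another example:

$(\sqrt[3]{1})_{1,2,3} = (\sqrt[3]{1})_{\rm ord}\left(\cos\frac{2k\pi}{3} + i\sin\frac{2k\pi}{3}\right)$   ($k = 0, 1, 2$)

since $1 = 1(\cos 0 + i\sin 0)$. This implies $(\sqrt[3]{1})_1 = 1$,
$(\sqrt[3]{1})_2 = -\frac{1}{2} + i\frac{\sqrt 3}{2}$ and $(\sqrt[3]{1})_3 = -\frac{1}{2} - i\frac{\sqrt 3}{2}$. In this case one of the
roots has turned out to be ordinary, real whereas the other two
roots are imaginary.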

The geometrical meaning of formula 
(4) is illustrated in Fig. 185 where we 
have taken n = 9. 
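
Formula (4) translates directly into a few lines of Python. The sketch below is an illustration only; the name `nth_roots` is ours. It lists the n values of the root for k = 0, 1, ..., n − 1 and reproduces the two examples above.

```python
import cmath
import math

def nth_roots(z, n):
    """All values of the nth root of z by formula (4); k = 0, 1, ..., n-1."""
    if z == 0:
        return [0j]                      # the only exception: all roots of 0 are 0
    rho, phi = abs(z), cmath.phase(z)
    r = rho ** (1.0 / n)                 # the "ordinary" real root of the modulus
    return [cmath.rect(r, (phi + 2 * math.pi * k) / n) for k in range(n)]

square_roots = nth_roots(2j, 2)          # expected: 1 + i and -1 - i
assert abs(square_roots[0] - (1 + 1j)) < 1e-12
assert abs(square_roots[1] - (-1 - 1j)) < 1e-12

cube_roots = nth_roots(1, 3)             # 1 and -1/2 +/- i*sqrt(3)/2
assert all(abs(w ** 3 - 1) < 1e-12 for w in cube_roots)
```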

Fig. 185

3. Conjugate Complex Numbers. The number $z^* = x - iy$ is called
conjugate to the number $z = x + iy$; we often write $\bar z$ instead
of $z^*$. The simple properties of conjugate numbers are the following:

(1) $(z^*)^* = (x - iy)^* = [x + i(-y)]^* = x - i(-y) = x + iy = z$, i.e. the
numbers z and $z^*$ are mutually conjugate;

(2) $z + z^* = 2\operatorname{Re} z$, $z - z^* = 2i\operatorname{Im} z$;

(3) $z^* = z$ if and only if z is real;

(4) $zz^* = (x + iy)(x - iy) = x^2 + y^2 = |z|^2$;

(5) $|z^*| = |z|$, $\operatorname{Arg} z^* = -\operatorname{Arg} z$, i.e. the points z and $z^*$
are symmetric with respect to the real axis;

(6) $(z_1 + z_2)^* = z_1^* + z_2^*$ since
$(z_1 + z_2)^* = (x_1 + iy_1 + x_2 + iy_2)^* = [x_1 + x_2 + i(y_1 + y_2)]^* = x_1 + x_2 - i(y_1 + y_2) = (x_1 - iy_1) + (x_2 - iy_2) = z_1^* + z_2^*$;

(7) $(z_1z_2)^* = z_1^*z_2^*$ which can be verified in the same way as
property (6).

If we substitute $\frac{z_1}{z_2}$ for $z_1$ into the formula expressing property (7)
we get $z_1^* = \left(\frac{z_1}{z_2}\right)^* z_2^*$ which yields

(8) $\left(\frac{z_1}{z_2}\right)^* = \frac{z_1^*}{z_2^*}$.

Properties (6) and (7) extend automatically to any arbitrary
number of summands or factors. For instance,

$(z^n)^* = (z^*)^n$,
$(2z^2 + iz^m)^* = (2z^2)^* + (iz^m)^* = 2(z^*)^2 - i(z^*)^m$ etc.
Generally, to pass from any rational expression containing an 
arbitrary number of variables and coefficients to the conjugate 
expression it is necessary to replace each variable and each coefficient 
by its conjugate value. It can be shown that this rule holds not only 
for rational expressions but also for irrational expressions, for sums 
of power series and so on. It follows that each equality involving 
complex expressions of the above type remains true when —i is 
substituted for i everywhere in the equality because the substitution 
transforms the original equality of complex numbers into the
equality of their conjugate numbers. The numbers i and −i are
therefore indistinguishable in the algebraic sense. It would be incorrect
to say that $i = \sqrt{-1}$ and $-i = -\sqrt{-1}$. In fact the root $\sqrt{-1}$
simply has two different values designated as $\pm i$.

Conjugate numbers can be used, in particular, to separate the
real and the imaginary parts of a fraction of the form
$\frac{z_1}{z_2} = \frac{x_1 + iy_1}{x_2 + iy_2}$.
To do this we multiply the numerator and the denominator by $z_2^*$
and obtain a fraction with a real denominator which enables
us to perform the separation easily.

For example,

$\operatorname{Re}\frac{2 + i5}{3 - i2} = \operatorname{Re}\frac{(2 + i5)(3 + i2)}{(3 - i2)(3 + i2)} = \operatorname{Re}\frac{-4 + i19}{13} = \operatorname{Re}\left(-\frac{4}{13} + i\frac{19}{13}\right) = -\frac{4}{13}$
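
The separation of the real and imaginary parts of a fraction can be mirrored step by step in Python. The function name `divide_by_conjugate` is our own; the quotient (2 + 5i)/(3 − 2i) is used only as a sample input.

```python
def divide_by_conjugate(z1, z2):
    """Compute z1 / z2 by multiplying numerator and denominator by z2*."""
    numerator = z1 * z2.conjugate()
    denominator = (z2 * z2.conjugate()).real   # |z2|^2, a real number
    return complex(numerator.real / denominator, numerator.imag / denominator)

q = divide_by_conjugate(2 + 5j, 3 - 2j)
# (2 + 5i)(3 + 2i) = -4 + 19i and (3 - 2i)(3 + 2i) = 13
assert abs(q - complex(-4 / 13, 19 / 13)) < 1e-12
assert abs(q - (2 + 5j) / (3 - 2j)) < 1e-12   # agrees with built-in division
```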


4. Euler’s Formula. Now we turn to the transcendental
operations on complex numbers. In Sec. IV.16 we showed that for real x,

$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots$   (5)

If we substitute z for x we shall come to the definition of the
exponential function with a complex exponent: it is defined as

$e^z = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \frac{z^3}{3!} + \dots$   (6)

We shall show in Sec. XVII.14 that this definition can be justified
for all z and, besides, the basic property of the exponential function
remains true:

$e^{z_1}e^{z_2} = e^{z_1 + z_2}$   (7)

Formulas (6) and (5) show that in the special case when z is real
the new definition of $e^z$ coincides with the old one; in general, each
new definition must not contradict the facts already established.
At the same time formula (7) confirms the expedience of this
definition of $e^z$.

In the same standard way we can define, for the complex values
of the argument, functions f (x) which were originally defined for
real values of the argument. For this purpose we should expand
a given function f (x) into Taylor's series in powers of x or in powers
of x − a where a is a real number and then replace x by z and denote
the sum of the series by f (z). Thus, by analogy with (6), we write,
using formulas (IV.56) and (IV.57), for complex z,

$\sin z = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \dots$   (8)

$\cos z = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \dots$   (9)

etc. In Sec. XVII.14 we shall show that all the basic formulas which
have the form of identical equalities and hold for real values of
the argument (for instance, such as $\sin(-z) = -\sin z$ or
$\sin^2 z + \cos^2 z = 1$ etc.) remain true for the complex values of the
argument.
The above formulas reveal the essential relationship between the
exponential function and trigonometric functions. Namely,
substituting iz for z into (6) we deduce

$e^{iz} = 1 + \frac{iz}{1!} + \frac{(iz)^2}{2!} + \frac{(iz)^3}{3!} + \frac{(iz)^4}{4!} + \frac{(iz)^5}{5!} + \dots = \left(1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \dots\right) + i\left(\frac{z}{1!} - \frac{z^3}{3!} + \frac{z^5}{5!} - \dots\right)$

This, together with (8) and (9), implies Euler's formula

$e^{iz} = \cos z + i\sin z$   (10)

which is very important. The formula

$e^{-iz} = e^{i(-z)} = \cos(-z) + i\sin(-z) = \cos z - i\sin z$

and the formulas

$\cos z = \frac{e^{iz} + e^{-iz}}{2}$ and $\sin z = \frac{e^{iz} - e^{-iz}}{2i}$   (11)

which are implied by the above formula and (10) are also often
used. All the formulas were discovered by Euler in 1743.

From Euler’s formula (10), using property (7), we obtain the
following expression for the exponential function with an arbitrary
complex exponent:

$e^z = e^{x + iy} = e^xe^{iy} = e^x(\cos y + i\sin y)$   (12)

The comparison with trigonometric form (3) shows that

$|e^z| = e^x, \quad \operatorname{Arg} e^z = y + 2k\pi$   (13)

In particular, it is seen that we always have $|e^z| > 0$, i.e. $e^z \ne 0$.
If we write z instead of $e^z$ in formula (12) then, by (13), we obtain

$z = |z|(\cos\arg z + i\sin\arg z) = |z|e^{i\arg z} = \rho e^{i\varphi}$

This “exponential form” of complex numbers is convenient for
performing algebraic operations on them.
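
Definition (6) and formula (12) can be compared numerically. In the sketch below the function `exp_series` (our name) sums the first terms of series (6); the number of terms, 40, is an arbitrary choice that is ample for the small arguments used here.

```python
import math

def exp_series(z, terms=40):
    """Partial sum of series (6): 1 + z/1! + z^2/2! + ..."""
    total = 0j
    term = 1 + 0j                 # current term z^n / n!
    for n in range(terms):
        total += term
        term *= z / (n + 1)
    return total

# Formula (12): e^(x+iy) = e^x (cos y + i sin y)
z = 1.0 + 2.0j
rhs = math.exp(z.real) * complex(math.cos(z.imag), math.sin(z.imag))
assert abs(exp_series(z) - rhs) < 1e-12

# Euler's formula (10) at z = pi gives e^(i*pi) = -1
assert abs(exp_series(1j * math.pi) + 1) < 1e-12
```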

Formulas (11) imply the following relationships between
trigonometric and hyperbolic functions (see Sec. I.28): $\cos z = \cosh iz$
and $\sin z = \frac{e^{iz} - e^{-iz}}{2i} = \frac{\sinh iz}{i}$, that is $\sinh iz = i\sin z$. From this,
substituting iz for z, we also deduce $\cos iz = \cosh z$ and $\sin iz = i\sinh z$.

It is these formulas that reveal the essential relationship between
the functions (the relationship was mentioned in Sec. I.28) which
enables us to transform relations between trigonometric functions
into the corresponding relations between hyperbolic functions and
vice versa. (Let the reader deduce the basic relation between cosh z
and sinh z by substituting iz for z into the formula
$\cos^2 z + \sin^2 z = 1$.)

Using formulas (11) we can also obtain the expression of powers
of sine and of cosine in terms of trigonometric functions of multiple
arguments. For instance,

$\cos^3 x = \left(\frac{e^{ix} + e^{-ix}}{2}\right)^3 = \frac{e^{i3x} + 3e^{ix} + 3e^{-ix} + e^{-i3x}}{8} = \frac{1}{4}\cdot\frac{e^{i3x} + e^{-i3x}}{2} + \frac{3}{4}\cdot\frac{e^{ix} + e^{-ix}}{2} = \frac{1}{4}\cos 3x + \frac{3}{4}\cos x$   (14)

etc. Transformations of this kind are used for integrating
trigonometric functions.

5. Logarithms of Complex Numbers. The definition of “complex
logarithms” is essentially the same as that for logarithms of real
numbers: the logarithm of a complex number z is the number w
for which $z = e^w$. To find the value of the logarithm let us denote

$z = \rho(\cos\varphi + i\sin\varphi), \quad w = u + iv$

Then we deduce from formula (12):

$\rho(\cos\varphi + i\sin\varphi) = z = e^w = e^u(\cos v + i\sin v)$

Since u and v are real this implies

$e^u = \rho$, i.e. $u = \ln\rho$; $v = \varphi + 2k\pi$   (k is an integer)

where $\ln\rho$ is understood as an “ordinary”, real logarithm of a
positive number. Thus,

$\operatorname{Ln} z = w = u + iv = \ln\rho + i\varphi + i2k\pi = \ln|z| + i\operatorname{Arg} z$

where Ln z denotes the totality of all the values of the logarithm.

Hence, a logarithm of a complex number has an infinite set of
different values. The number “zero” is the only exception to the
rule because it has no logarithm. We can write conditionally
(formally): $\operatorname{Ln} 0 = -\infty + iv$ where v is arbitrary.

Real positive numbers being a special case of complex numbers,
their logarithms are also infinite-valued. One of the values is
“ordinary”, real, whereas all the others are imaginary. For example,

$\operatorname{Ln} 1 = \ln 1 + i0 + i2k\pi = i2k\pi$   ($k = 0, \pm 1, \pm 2, \dots$)

For k = 0 we get the original value ln 1 = 0 but at the same time
we can take for the logarithm of 1 the values $i2\pi$, $-i2\pi$, $i4\pi$ etc.
Let us check it up once again:

$e^{i2k\pi} = \cos 2k\pi + i\sin 2k\pi = 1 + i0 = 1$   ($k = 0, \pm 1, \pm 2, \dots$)   (15)

Negative real numbers also have logarithms but all their values are
imaginary. For example, $\operatorname{Ln}(-1) = i\pi(2k + 1)$ (check it up!).

By means of logarithms we raise a complex number to an
arbitrary complex power: the involution is defined as

$z_1^{z_2} = (e^{\operatorname{Ln} z_1})^{z_2} = e^{z_2\operatorname{Ln} z_1}$

and the right-hand side is calculated by formula (12). Since
logarithms are infinite-valued the whole power is also infinite-valued
in the general case.
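
A few values of Ln z and of a complex power can be listed with the following Python sketch. The function names `log_values` and `power_values` are ours, and the range of k is chosen by the caller; the power $i^i$ is taken only as a sample, since all of its values turn out to be real.

```python
import cmath
import math

def log_values(z, ks):
    """Values ln|z| + i(arg z + 2*k*pi) of Ln z for the integers k in ks."""
    if z == 0:
        raise ValueError("zero has no logarithm")
    return [complex(math.log(abs(z)), cmath.phase(z) + 2 * math.pi * k) for k in ks]

def power_values(z1, z2, ks):
    """Values of z1**z2 = exp(z2 * Ln z1), one for each chosen branch k."""
    return [cmath.exp(z2 * w) for w in log_values(z1, ks)]

# Every value of Ln(-1) is imaginary and equals i*pi*(2k + 1).
for k, w in zip((-1, 0, 1), log_values(-1, (-1, 0, 1))):
    assert abs(w - 1j * math.pi * (2 * k + 1)) < 1e-12
    assert abs(cmath.exp(w) + 1) < 1e-12

# i**i = exp(-(pi/2 + 2*k*pi)); for k = 0 this is exp(-pi/2).
assert abs(power_values(1j, 1j, (0,))[0] - math.exp(-math.pi / 2)) < 1e-12
```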


§ 2. Complex Functions of a Real Argument 


6. Definition and Properties. It is sometimes necessary to deal 
with functions which assume complex values although their inde- 
pendent variable, the argument, is real. As examples of such func- 
tions we can take 


(1) $z = (t + i)^3$;   (2) $z = Me^{pt}$ ($p = \alpha + i\omega$), etc.

Here the independent variable is denoted by t and the functions
are denoted by z. If we decompose the value of such a function into
its real and imaginary parts, i.e. z = x + iy, then each of the parts
will be a function of t; thus, in the above examples we obtain

(1) $x = t^3 - 3t$, $y = 3t^2 - 1$;   (2) $x = Me^{\alpha t}\cos\omega t$, $y = Me^{\alpha t}\sin\omega t$

(verify these relations taking into account that M is real!).
In the general case, if

$z = f(t) = \varphi(t) + i\psi(t)$   (16)

we obtain

$x = \varphi(t), \quad y = \psi(t)$   (17)

Conversely, (17) yields (16). The indication of a complex function
of a real argument is therefore equivalent to the indication of two
ordinary, real, functions of the argument. This situation is quite
analogous to the case when we have a vector function of a scalar
argument (see Sec. VII.23). The analogy becomes still more complete
if we interpret complex numbers as vectors (see Sec. 2).

It follows that the theory of complex functions of a real argument
does not involve any essentially new features in comparison with
the theory of real functions. In particular, the definitions of the
continuity, of the derivative etc. are transferred without any changes.
From the considerations in Sec. XVII.14 it follows that all the
differentiation formulas remain true. For instance,

$[(t + i)^3]' = 3(t + i)^2$, $(Me^{pt})' = Mpe^{pt}$ and so forth


A function of form (16) is represented by a curve in a complex plane 
which has parametric equations (17). 

When using functions of form (16) the following obvious pro- 
perties should be taken into account: 

if complex functions are added up their real parts and their ima- 
ginary parts are added separately; 

if a complex function is multiplied by a real constant or by a real 
function the real and the imaginary parts are also multiplied by the 
same factor; 

if a complex function is differentiated the same operation is per- 
formed on its real and imaginary parts. 

The properties are expressed by the formulas 


$\operatorname{Re}[f_1(t) + f_2(t)] = \operatorname{Re} f_1(t) + \operatorname{Re} f_2(t)$ etc.


(Deduce the formulas!) 
These properties make it possible to perform the above opera- 
tions on the whole complex function and then to take the real or 
the imaginary part of the resulting expression instead of performing
the operations on the real or on the imaginary part. It is remarkable 
that such a transition to complex quantities together with the reverse 
transition to the real quantities which are sought for sometimes 
turns out to be more obvious and simpler than the corresponding 
direct operations on the real quantities. 

Here we also mention a function of the form 


$\operatorname{Ln}[t - (a + ib)] = \ln|t - a - ib| + i\operatorname{Arg}[t - a - ib] = \frac{1}{2}\ln[(t - a)^2 + b^2] + i\left[\arctan\frac{b}{a - t} + k\pi\right]$

The function is sometimes of use for certain applications when the 
integer k is chosen in an appropriate way. 
7. Applications to Describing Oscillations. Let us take the function

$U(t) = Me^{i(\omega t + \alpha)} = M\cos(\omega t + \alpha) + iM\sin(\omega t + \alpha)$   ($M > 0$, $\omega > 0$)   (18)

This function is convenient to apply for investigating harmonic
oscillations (compare with Sec. I.29). For this purpose it should be
noted that expression (18) has the modulus M and the argument
$\omega t + \alpha$, i.e. it is represented by a vector of constant length which
rotates uniformly with the angular speed $\omega$.

For example, let us consider the superposition of oscillations
with equal frequencies. Let it be necessary to sum up the two
quantities $u_1(t) = M_1\sin(\omega t + \alpha_1)$ and $u_2(t) = M_2\sin(\omega t + \alpha_2)$. To
do this we introduce the corresponding complex quantities
$U_1(t) = M_1e^{i(\omega t + \alpha_1)}$ and $U_2(t) = M_2e^{i(\omega t + \alpha_2)}$, $u_1$ and $u_2$ being,
respectively, their imaginary parts. The vectors $U_1(t)$ and $U_2(t)$
rotate uniformly with the angular speed $\omega$ and therefore the vector
$U_1(t) + U_2(t)$ rotates uniformly with the same speed. Hence this
vector can be represented in form (18). To find M and $\alpha$ it is
sufficient to consider the situation at the moment t = 0 (see Fig. 186).
The figure shows that projecting on the coordinate axes we obtain

$M\cos\alpha = M_1\cos\alpha_1 + M_2\cos\alpha_2,$
$M\sin\alpha = M_1\sin\alpha_1 + M_2\sin\alpha_2$   (19)

Taking the imaginary part of U (t) we finally conclude that
$u_1(t) + u_2(t) = M\sin(\omega t + \alpha)$ where M and $\alpha$ should be found from
equalities (19) (find them and explain the corresponding formula
for M by means of Fig. 186).

A similar result follows when we have the superposition of an 
arbitrary number of harmonic oscillations with the same frequency. 
The superposition of oscillations having different frequencies will 
be discussed in Sec. XVII.23. 
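
The superposition of two oscillations of equal frequency can be reproduced in Python by adding the complex amplitudes at t = 0, exactly as in equalities (19). This is a minimal sketch; the function name `superpose` and the sample amplitudes, phases and frequency are our own.

```python
import cmath
import math

def superpose(M1, alpha1, M2, alpha2):
    """Amplitude M and phase alpha of M1 sin(wt + a1) + M2 sin(wt + a2)."""
    U0 = cmath.rect(M1, alpha1) + cmath.rect(M2, alpha2)   # U1(0) + U2(0)
    return abs(U0), cmath.phase(U0)

omega = 3.0
M, alpha = superpose(2.0, 0.3, 1.5, 1.2)
for t in (0.0, 0.4, 1.7):
    direct = 2.0 * math.sin(omega * t + 0.3) + 1.5 * math.sin(omega * t + 1.2)
    assert math.isclose(direct, M * math.sin(omega * t + alpha), abs_tol=1e-12)
```

The frequency does not enter `superpose` at all, which reflects the remark that it suffices to consider the moment t = 0.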
The advantage of exponential function (18) over the trigonometric
functions is particularly well seen when we differentiate:

$U' = Me^{i(\omega t + \alpha)}\,i\omega = i\omega U$   (20)

By Sec. 2, we again obtain a vector which rotates uniformly with
the angular speed $\omega$ but it leads U by 90° and has a modulus which
is $\omega$ times that of U. The rotation and the stretching are repeated
when the differentiation of U is continued.

Now we are going to demonstrate an application of functions of
form (18) to an electric circuit shown in Fig. 187. There is a
resistance R and an inductance L in the circuit. If an alternating
potential difference which varies according to the law
$\varphi = \varphi_0\sin(\omega t + \beta)$ is applied to the terminals of the circuit there
appears a steady-state alternating electric current also varying
harmonically: $j = j_0\sin(\omega t + \alpha)$, but $j_0$ and $\alpha$ are not known
beforehand. Equating $\varphi$ to the resulting voltage drop on R and L we
deduce, using the well-known physical laws, the basic equation
of our problem: $Rj + L\frac{dj}{dt} = \varphi$.

Fig. 186    Fig. 187

Let us introduce the notion of a complex voltage and that of a
complex current by the formulas $\Phi = \varphi_0e^{i(\omega t + \beta)}$ and $J = j_0e^{i(\omega t + \alpha)}$.
The “real” voltage and current are equal to the imaginary parts
of the corresponding expressions. By the properties described in
Sec. 6, to find j we must solve the equation $RJ + L\frac{dJ}{dt} = \Phi$ and
then take the imaginary part of the resulting expression. According
to formula (20) we obtain

$RJ + Li\omega J = \Phi$, i.e. $J = \frac{\Phi}{R + i\omega L}$   (21)

We see that the inductance L can be interpreted as a certain
resistance equal to $i\omega L$; this quantity is called the impedance of the
unit L. Writing the expression $(R + i\omega L)^{-1}$ in the exponential form
$(R + i\omega L)^{-1} = re^{i\theta}$ we deduce from (21) (verify it!):

$j_0 = r\varphi_0 = \varphi_0(R^2 + \omega^2L^2)^{-1/2}, \quad \alpha = \beta - \arctan\frac{\omega L}{R}$   (22)

(Let the reader try to derive the expression for the vector
$j_0e^{i\alpha} = J|_{t=0}$ by using the first equality (21) and the geometric method
and then deduce formula (22) from the expression.)


§ 3. The Concept of a Function of a Complex Variable 


The theory and the applications of complex functions of a com-
plex variable contain many new ideas and facts in comparison with
the theory of functions of a real argument. The theory is dealt with 
in many books. To a beginner we should recommend [15], [40], 
[44, Vol. 8, Part 1]. Here we shall give only some simple facts which 
are directly related to the subject matter of our course. 

8. Factorization of a Polynomial. Let us take an entire rational
function of the argument z = x + iy, that is a polynomial

$P(z) = a_0z^n + a_1z^{n-1} + \dots + a_{n-1}z + a_n$   ($a_0 \ne 0$)   (23)

of the nth degree with certain coefficients $a_0, \dots, a_n$ which can be
complex in the general case. In the books mentioned above the
reader can find the proof of a remarkable theorem, namely the
“fundamental theorem of algebra” which asserts that every polynomial of
degree $n \ge 1$ has at least one complex zero, i.e. there exists a root
of the equation

$P(z) = 0$   (24)

which can be either real or imaginary. (The theorem was proved by
D'Alembert in the middle of the 18th century. A more rigorous
proof was given by Gauss at the end of the 18th century.) If we denote
one of the roots by $z_1$ then, as is proved in elementary courses
on algebra, P (z) is divisible by the binomial $z - z_1$, that is
$P(z) = (z - z_1)P_1(z)$ where $P_1(z)$ is a polynomial of the (n − 1)th
degree. If we repeat the argument for $P_1(z)$ we shall get
$P_1(z) = (z - z_2)P_2(z)$, i.e. $P(z) = (z - z_1)(z - z_2)P_2(z)$ where $P_2(z)$
is a polynomial of degree n − 2. These considerations can be
continued up to the “polynomial of degree zero”, i.e. a constant, and
thus we obtain

$P(z) = a_0(z - z_1)(z - z_2)\dots(z - z_n)$   (25)

This formula shows that all the numbers $z_1, z_2, \dots, z_n$ are zeros
of the polynomial P (z) and that it has no other zeros.

Thus, an algebraic equation of form (24) of the nth degree has
exactly n roots.

Some of the roots of equation (24) may coincide, i.e. they may be
repeated. Such roots are called multiple (double, i.e. of
multiplicity 2, or triple, i.e. of multiplicity 3 etc.) in contrast to simple
roots which are not repeated. When figuring the number of roots
we should reckon each root according to its degree of multiplicity.

A simple test for distinguishing the multiplicity of a root is
implied by Taylor’s formula for a polynomial (IV.46) which, as we
shall show in Sec. 11, remains true for a polynomial in a complex
variable. Suppose z = a is a root of equation (24). Then P (a) = 0
and the formula implies that if

$P(a) = P'(a) = \dots = P^{(k-1)}(a) = 0, \quad P^{(k)}(a) \ne 0$   (26)

then the polynomial P (z) is divisible by $(z - a)^k$ and is indivisible
by z − a to any power higher than k. The value z = a is therefore
a root of equation (24) of multiplicity k.

If P (z) is an arbitrary function, even a transcendental one, and
relations (26) hold for a value z = a then z = a is also called
a root (a zero) of equation (24) of multiplicity k. (In
particular, if $P(a) = 0$ and $P'(a) \ne 0$ the value z = a is a simple
root.) In the case of an arbitrary P (z) we conclude, by Taylor’s
series (IV.53), that for a k-tuple root z = a the ratio $P(z)/(z - a)^k$
has a finite limit, as $z \to a$, which does not equal zero. Hence, the
ratio has a removable discontinuity at z = a (see Sec. III.13). Thus
we can write

$P(z) = (z - a)^k\,\frac{P(z)}{(z - a)^k} = (z - a)^k\,Q(z)$

where the function Q (z) remains continuous for z = a and $Q(a) \ne 0$.
For example, the value z = 0 is a triple root of the equation

$z - \sin z = 0$

since $(z - \sin z)|_{z=0} = 0$, $(1 - \cos z)|_{z=0} = 0$, $(\sin z)|_{z=0} = 0$,
and $(\cos z)|_{z=0} = 1$.
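
For a polynomial, test (26) is easy to carry out mechanically. The Python sketch below is an illustration only; the function names and the tolerance `tol` are our own assumptions, and the polynomial is given by the list of its coefficients $a_0, \dots, a_n$. It differentiates the polynomial repeatedly until the value at z = a ceases to vanish.

```python
def poly_eval(coeffs, z):
    """Value of a0 z^n + ... + an by Horner's scheme; coeffs = [a0, ..., an]."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value

def poly_derivative(coeffs):
    """Coefficients of the derivative of the polynomial."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def multiplicity(coeffs, a, tol=1e-9):
    """Number k in relations (26); 0 means that a is not a root."""
    k = 0
    while len(coeffs) > 1 and abs(poly_eval(coeffs, a)) < tol:
        coeffs = poly_derivative(coeffs)
        k += 1
    return k

# P(z) = (z - 1)^2 (z + 2) = z^3 - 3z + 2
P = [1, 0, -3, 2]
assert multiplicity(P, 1) == 2      # double root
assert multiplicity(P, -2) == 1     # simple root
assert multiplicity(P, 0) == 0      # not a root
```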
If we combine equal factors in the factorization (25) we obtain

$P(z) = a_0(z - z_1)^{\alpha_1}\dots(z - z_p)^{\alpha_p}$   (27)

where $z_1, \dots, z_p$ are all the pairwise different roots of equation
(24) and $\alpha_1, \dots, \alpha_p$ are their multiplicities. Factorizations (25)
and (27) hold for polynomials (23) with real and complex coefficients
as well.

If polynomial (23) has real coefficients then, together with every
complex root, it has the conjugate root of the same multiplicity.
Indeed, if $P(z_m) = 0$ then $[P(z_m)]^* = 0^* = 0$. But, by Sec. 3,

$[P(z_m)]^* = [a_0z_m^n + \dots + a_n]^* = a_0^*(z_m^*)^n + \dots + a_n^* = a_0(z_m^*)^n + \dots + a_n = P(z_m^*)$

from which it follows that $z = z_m^*$ is also a solution of equation (24).
In case $z = z_m$ is a double root of equation (24) we have, in
addition, $P'(z_m) = 0$ which implies, in a similar way, that $P'(z_m^*) = 0$
and so on.

Combining the factors which correspond to a pair of mutually
conjugate roots of the form α ± iβ in factorization (27) we obtain

$$[z - (\alpha + i\beta)]\,[z - (\alpha - i\beta)] = (z - \alpha)^2 + \beta^2 = z^2 + pz + q \qquad (p = -2\alpha,\ q = \alpha^2 + \beta^2) \tag{28}$$


Such combinations are used for factoring polynomials with real
coefficients depending on real argument. It is natural to denote the
independent variable as x, and thus we obtain from (27) the expression

$$P(x) = a_0 (x - x_1)^{\alpha_1} \dots (x - x_r)^{\alpha_r} (x^2 + p_1 x + q_1)^{\beta_1} \dots (x^2 + p_s x + q_s)^{\beta_s} \tag{29}$$

where the first r parentheses correspond to the real roots and the
last s parentheses correspond to s pairs of conjugate imaginary roots.
Since p and q are real we conclude that every real polynomial (i.e.
a polynomial with real coefficients) can be factored into real linear
and quadratic factors. If all the roots are real there are only linear
factors in factorization (29) and if all the roots are imaginary then
there are only quadratic factors in it. The exponents $\alpha_1, \dots, \alpha_r$,
$\beta_1, \dots, \beta_s$ are equal to the multiplicities of the corresponding
roots; in particular, they equal unity for simple roots.
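Formula (28) can be checked numerically. In the Python sketch below the function name `real_quadratic` and the sample root 1 + 2i are illustrative assumptions; the sketch computes p and q for a given imaginary root and verifies that the root and its conjugate both satisfy the resulting real quadratic.

```python
def real_quadratic(root):
    """Return (p, q) such that (z - root)(z - conj(root)) = z^2 + p z + q, see (28)."""
    alpha, beta = root.real, root.imag
    return -2 * alpha, alpha ** 2 + beta ** 2


# Hypothetical example: the pair of conjugate roots 1 + 2i and 1 - 2i
root = 1 + 2j
p, q = real_quadratic(root)
for z in (root, root.conjugate()):
    # both values should vanish up to rounding errors
    print(z, z * z + p * z + q)
```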

9. Numerical Methods of Solving Algebraic Equations. To realize 
factorizations (27) and (29) it is necessary to solve equation (24), 
that is an algebraic equation of the nth degree. The solution for 
n = 2 is well known from elementary mathematical courses. In 
textbooks on higher algebra there are formulas for the solutions for 
n = 3 and n = 4 which were discovered as early as the 16th century.
But the formulas are so complicated that they are almost never 
used for practical purposes, especially for n = 4. In case n > 4 
there are no general formulas which express solutions in terms of 
the coefficients of the equation by means of algebraic operations 
on the coefficients. The non-existence of such formulas was proved 
by Abel and by É. Galois (1811-1832), a French mathematician, 
who created the fundamentals of modern algebra.

But algebraic equations can be solved approximately to within
any degree of accuracy! In § V.4 we described some methods of
calculating real roots of equations of form f(x) = 0. To find imaginary
roots we can use Newton’s method [see formula (V.7) in
which we may regard x as an imaginary quantity] or an iterative
scheme (see Sec. V.3). It is also possible to substitute z = x + iy
into equation (24) and then separate the real and the imaginary
parts:

$$P(z) = P(x + iy) = Q(x, y) + iR(x, y)$$

This reduces equation (24) to a system of equations of the form

$$Q(x, y) = 0, \qquad R(x, y) = 0$$

which can be solved with the help of methods discussed in
Sec. XII.12. These methods are also described in [3], [6], [40],
[33] and [42].
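As an illustration of Newton's method applied to an imaginary initial approximation, consider the following Python sketch. It relies on the standard Newton iteration z ← z − P(z)/P'(z); the function name `newton`, the tolerance, the starting value −1 + i and the sample cubic z³ + z² − 3 = 0 are our own assumptions chosen for the demonstration.

```python
def newton(p, dp, z0, tol=1e-12, max_iter=100):
    """Newton's iteration z <- z - p(z)/dp(z); z0 may be a complex number."""
    z = z0
    for _ in range(max_iter):
        step = p(z) / dp(z)
        z = z - step
        if abs(step) < tol:
            return z
    raise ArithmeticError("Newton's method did not converge")


def p(z):
    return z ** 3 + z ** 2 - 3


def dp(z):
    return 3 * z ** 2 + 2 * z


# A real starting value can only produce real iterates for a real polynomial,
# so an imaginary root requires a starting value with a non-zero imaginary part.
root = newton(p, dp, -1 + 1j)
print(root, abs(p(root)))
```

The conjugate root is obtained, without further computation, from the property of real polynomials proved above.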

There exist methods that are only applicable to algebraic equa- 
tions. We shall give here a method which was introduced by several 
authors and, among them, by N. I. Lobachevsky in 1834. 

Every algebraic equation can be written in the form 


$$z^n + a_1 z^{n-1} + a_2 z^{n-2} + \dots + a_n = 0 \tag{30}$$

with the coefficient in the highest power equal to unity (why is it
so?). Separating the even and the odd powers we get

$$z^n + a_2 z^{n-2} + a_4 z^{n-4} + \dots = -a_1 z^{n-1} - a_3 z^{n-3} - \dots$$

If now we square both sides of the last equality we obtain an equation
containing only even powers of z. Therefore, denoting z² = p we
receive an equation of the nth degree for p (why is it so?) whose roots
are the squares of the roots of equation (30). If we then transform
the equation in like manner and put p² = q we shall arrive at an
equation of the nth degree with the roots equal to the fourth powers
of the roots of equation (30) etc.
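One step of this transformation can be sketched in Python. The sketch assumes a polynomial with the highest coefficient equal to unity, given by the list of its coefficients (highest power first). Instead of transposing the odd powers and squaring, it forms the equivalent product P(z)P(−z), which contains only even powers of z. The function name and the sample cubic z³ + z² − 3 are illustrative assumptions.

```python
def square_roots_step(coeffs):
    """Return the coefficients of the polynomial whose roots are the squares
    of the roots of the given polynomial (highest power first)."""
    n = len(coeffs) - 1
    # P(-z): the coefficient of z^k changes its sign for odd k
    flipped = [c if (n - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]
    # product P(z) * P(-z); entry i + j belongs to the power 2n - (i + j)
    product = [0] * (2 * n + 1)
    for i, a in enumerate(coeffs):
        for j, b in enumerate(flipped):
            product[i + j] += a * b
    # keep the even powers only (z^2 = p) and restore the unit leading coefficient
    sign = -1 if n % 2 else 1
    return [sign * c for c in product[::2]]


# Hypothetical example: z^3 + z^2 - 3 = 0, three successive transformations
poly = [1, 1, 0, -3]
for _ in range(3):
    poly = square_roots_step(poly)
    print(poly)
```

The coefficients grow rapidly with every step, which is why integer (exact) arithmetic is convenient here.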

After several transformations of this type the roots with the
greatest moduli become the most important. For instance, if equation
(30) has the roots $z_1 = 2$, $z_2 = -1$ and $z_3 = \frac{1}{2}$ the next equation
has the roots $p_1 = 4$, $p_2 = 1$ and $p_3 = \frac{1}{4}$. The equation following
the above equations will have the roots $q_1 = 16$, $q_2 = 1$ and $q_3 = \frac{1}{16}$
etc. After m transformations are carried out we arrive at an equation
of the form

$$v^n + c_1 v^{n-1} + \dots + c_n = 0 \tag{31}$$

and its roots, yet unknown, are equal to $v_1 = z_1^{2^m}, \dots, v_n = z_n^{2^m}$.
Therefore, by (25), equation (31) has the form

$$(v - z_1^{2^m}) \dots (v - z_n^{2^m}) = 0$$


Now let us substitute $v = 10^l \bar v$ into equation (31) and let us
choose the integer l in such a way that in the resulting equation
for $\bar v$, after the division by the highest coefficient $10^{nl}$, the greatest
of the moduli of the coefficients in $\bar v^{n-1}, \bar v^{n-2}, \dots, 1$ should be
of the order of 1, that is the greatest modulus is neither too large
nor too small. Then, since the equation thus obtained, after factoring,
must have the form

$$(\bar v - \bar v_1) \dots (\bar v - \bar v_n) = 0 \qquad (\text{where } \bar v_1 = 10^{-l} v_1 = 10^{-l} z_1^{2^m} \text{ etc.}) \tag{32}$$

we conclude that the greatest root (or roots) of equation (32) is
(are) of the order of 1 whereas all the other roots are negligibly
small. Omitting these roots, i.e. equating them to zero, we thus
delete in the equation for $\bar v$ the terms with too small coefficients.

Solving the equation for $\bar v$ obtained after the deletion we find
approximate values of the roots having the greatest moduli. Then
turning back to z we receive approximations to the roots of equation
(30) with the greatest moduli. The greater m, the greater the accuracy
of the approximations. There appears a difficulty in the transition
from v to z connected with the extraction of a root with the
index of radical $2^m$. If equation (30) has real coefficients and if we
get only one root with the greatest modulus in the equation for $\bar v$
then equation (30) will also have only one root with the greatest
modulus and it will be real (why is it so?). Hence, in this case we
can limit ourselves to calculating only two real values of the root.
But if these conditions do not hold we have to extract the root 
according to the rules of Sec. 2 and thus obtain many possible values 
of the root. To determine which of the values should be taken we 
may substitute them all (in succession) into equation (30) and thus 
verify which of the roots satisfies the equation in the best way.
It is also possible to use the well-known relations between the 
roots of an algebraic equation and its coefficients. To deduce these 
relations it is necessary to compare the coefficients in equal powers 
of z in formulas (23) and (25) removing the parentheses in (25). 
This yields 
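$$\begin{aligned}
z_1 + z_2 + \dots + z_n &= -\frac{a_1}{a_0} \\
z_1 z_2 + z_1 z_3 + \dots + z_{n-1} z_n &= \frac{a_2}{a_0} \quad \text{(all the possible products of two factors)} \\
&\;\;\vdots \\
z_1 z_2 \dots z_n &= (-1)^n \frac{a_n}{a_0}
\end{aligned} \tag{33}$$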


After the root $v_1$ of equation (31) has been found it is possible
to eliminate the binomial $v - v_1$ from the equation by dividing
the equation by the binomial. Then in like manner we can find the
next root etc. It is often possible to find the roots by applying rela- 
tions (33) to equation (31). Doing this we should take into account 
that in case the roots of equation (31) differ greatly in the values of 
their moduli it is permissible to neglect certain summands in the 
left-hand sides of relations (33). 

When the roots of equation (30) are determined approximately 
it is possible to apply some of the iterative methods described 
in § V.1, for example, Newton’s method (Sec. V.2), to make the
approximations more accurate. 

Now let us consider as an example the equation z³ + z² − 3 = 0
already solved in Sec. V.2. The successive transformations according
to the Lobachevsky method yield

$$p^3 - p^2 + 6p - 9 = 0, \qquad q^3 + 11q^2 + 18q - 81 = 0, \qquad u^3 - 85u^2 + 2106u - 6561 = 0$$

(verify it!). Here we stop the process and make the substitution
$u = 10^l \bar u$ which results in

$$\bar u^3 - \frac{85}{10^l}\,\bar u^2 + \frac{2106}{10^{2l}}\,\bar u - \frac{6561}{10^{3l}} = 0 \tag{34}$$

In this case we must take l = 2 (verify it by choosing another value
of l). Then the absolute term in equation (34) becomes small and
may be omitted. This implies

$$\bar u^2 - 0.85\,\bar u + 0.2106 = 0 \tag{35}$$

Hence, $\bar u_1$ and $\bar u_2$, and therefore $z_1$ and $z_2$, are pairwise conjugate
imaginary numbers. Further, since $\bar u_1 \bar u_2 = 0.2106$ we have
$u_1 u_2 = 2106$ and, by the equality $(z_1 z_2)^8 = u_1 u_2$, we deduce

$$z_1 z_2 = |z_{1,2}|^2 = \sqrt[8]{2106} = 2.60$$

But $z_1 z_2 z_3 = 3$ (why is it so?), i.e. $z_3 = \frac{3}{2.60} = 1.15$.

Besides, $z_1 + z_2 + z_3 = -1$ (why is it so?). Therefore if we put
$z_{1,2} = \alpha \pm i\beta$ then 2α + 1.15 = −1 which implies α = −1.08.
But $\alpha^2 + \beta^2 = |z_{1,2}|^2 = 2.60$, that is $\beta = \sqrt{2.60 - 1.08^2} = 1.20$.
Thus, approximately, $z_{1,2} = -1.08 \pm i\,1.20$ and $z_3 = 1.15$.

Iterations by Newton’s formula (V.6) yield more accurate values

$z_{1,2} = -1.087 \pm i\,1.172$ and $z_3 = 1.175$.
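The arithmetic of this example is easily reproduced. The Python lines below are merely a check of the numbers under the assumptions stated in the comments (three transformations, the product of the two dominant roots equal to 2106); the variable names are our own.

```python
import math

u1u2 = 2106          # product of the two dominant roots of the equation for u
m = 3                # number of transformations carried out (z -> p -> q -> u)

r2 = u1u2 ** (1 / 2 ** m)       # z1 z2 = |z_{1,2}|^2, the 8th root of 2106
z3 = 3 / r2                     # from z1 z2 z3 = 3
alpha = (-1 - z3) / 2           # from z1 + z2 + z3 = -1
beta = math.sqrt(r2 - alpha ** 2)

print(round(r2, 2), round(z3, 2), round(alpha, 2), round(beta, 2))
```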
The Lobachevsky method is especially convenient when the roots
of the equation are real and different in their moduli. It is sometimes
possible to simplify the calculations. For instance, in solving
the above problem it was possible to limit the calculations to finding
only $z_3$ because after dividing by $z - z_3$ we can reduce the problem
to a quadratic equation.
The Lobachevsky method and some similar methods are treated
in detail in [22], [28], [38] and [50].

10. Decomposition of a Rational Fraction into Partial Rational
Fractions. Remember that a rational fraction (see Sec. I.17) is
a ratio of two polynomials:

$$\frac{Q(z)}{P(z)} = \frac{b_0 z^m + \dots + b_m}{a_0 z^n + \dots + a_n} \tag{36}$$

If m < n the fraction is called proper (no matter what the values
of the coefficients are) and it is called improper if otherwise. An 
improper fraction can always be represented as a sum of an entire 
rational function (i.e. a polynomial) and a proper fraction. For 
instance, we can achieve this by dividing the numerator by the 
denominator according to the usual rule of division of polynomials. 
For example, 


Ti 
z3 3z z2—1 1 2 
arita mys 2 tms 
and so forth. 
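The separation of the entire part by the usual rule of division of polynomials can be sketched as follows. The Python function `poly_divmod` and the sample fraction x³/(x² + 1) are our own illustrative assumptions; polynomials are represented by lists of coefficients (highest power first) and exact fractions are used to avoid rounding errors.

```python
from fractions import Fraction


def poly_divmod(num, den):
    """Divide the polynomial num by den (coefficient lists, highest power first).
    Returns (quotient, remainder); the remainder has a lower degree than den."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        # subtract factor * den, aligned with the leading term of num
        for i, d in enumerate(den):
            num[i] -= factor * d
        num.pop(0)  # the leading term has been cancelled
    return quotient, num


# Hypothetical example: x^3 / (x^2 + 1) = x + (-x) / (x^2 + 1)
q, r = poly_divmod([1, 0, 0, 0], [1, 0, 1])
print(q, r)
```

Here the quotient represents the entire part and the remainder is the numerator of the proper fraction.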
It is important to remark that a sum of proper rational fractions 
is also a proper rational fraction whereas this rule does not hold 
for fractional numbers. To prove this let us mark off the degree of
a polynomial by the subscript. Then we have

$$\frac{Q_m(z)}{P_n(z)} + \frac{\bar Q_{\bar m}(z)}{\bar P_{\bar n}(z)} = \frac{Q_m(z)\,\bar P_{\bar n}(z) + \bar Q_{\bar m}(z)\,P_n(z)}{P_n(z)\,\bar P_{\bar n}(z)}$$

If the fractions on the left-hand side are proper then we have m < n
and $\bar m < \bar n$. But in this case the first summand in the numerator
of the expression on the right-hand side is of degree $m + \bar n < n + \bar n$
and the second summand is of degree $\bar m + n < n + \bar n$. The whole
numerator is therefore of degree $< n + \bar n$ whereas the degree of
the denominator equals $n + \bar n$. Hence, the sum is a proper fraction.
Obviously, the same is true for any number of summands.

Let (36) be a proper fraction. Factor the denominator into linear
factors and combine similar factors [see formula (27)]. Then we can
prove that the fraction can be represented in the form

$$\begin{aligned}
\frac{Q(z)}{P(z)} ={}& \frac{A_1}{(z - z_1)^{\alpha_1}} + \frac{A_2}{(z - z_1)^{\alpha_1 - 1}} + \dots + \frac{A_{\alpha_1}}{z - z_1} + {} \\
& + \frac{B_1}{(z - z_2)^{\alpha_2}} + \frac{B_2}{(z - z_2)^{\alpha_2 - 1}} + \dots + \frac{B_{\alpha_2}}{z - z_2} + \dots + {} \\
& + \frac{D_1}{(z - z_p)^{\alpha_p}} + \dots + \frac{D_{\alpha_p}}{z - z_p}
\end{aligned} \tag{37}$$

where $A_1, A_2, \dots, D_{\alpha_p}$ are certain numerical coefficients. The
rational fractions appearing on the right-hand side are called partial
rational fractions of the first type. Thus, every proper rational fraction
may be represented as a sum of partial rational fractions of
the first type.


To prove the assertion we transform the fraction as follows:

$$\begin{aligned}
\frac{Q(z)}{P(z)} &= \frac{Q(z)}{a_0 (z - z_1)^{\alpha_1} (z - z_2)^{\alpha_2} \dots (z - z_p)^{\alpha_p}} = \\
&= \frac{Q(z)\,[(z - z_1) - (z - z_2)]}{a_0 (z_2 - z_1)(z - z_1)^{\alpha_1} (z - z_2)^{\alpha_2} \dots (z - z_p)^{\alpha_p}} = \\
&= \frac{Q(z)}{a_0 (z_2 - z_1)(z - z_1)^{\alpha_1 - 1} (z - z_2)^{\alpha_2} \dots (z - z_p)^{\alpha_p}} - \frac{Q(z)}{a_0 (z_2 - z_1)(z - z_1)^{\alpha_1} (z - z_2)^{\alpha_2 - 1} \dots (z - z_p)^{\alpha_p}}
\end{aligned}$$

The constant factor $z_2 - z_1$ may be combined with the coefficient
$a_0$ and thus the number of linear factors in the denominators of the
two resulting fractions is by one less than that of the original fraction.
Repeating transformations of this kind for each fraction we
again reduce the number of linear factors in the denominators by
one and so forth. We can proceed in this way as long as there are
at least two different factors in the denominators. After a certain
number of transformations there will no longer be different linear
factors in the denominators, that is we shall arrive at a sum of fractions
of the form $\dfrac{Q(z)}{a (z - z_j)^k}$.
If we expand the numerator into powers of $z - z_j$ (see the beginning
of Sec. IV.15) we obtain

$$\frac{Q(z)}{a (z - z_j)^k} = \frac{c + d\,(z - z_j) + \dots}{a (z - z_j)^k} = \frac{c}{a (z - z_j)^k} + \frac{d}{a (z - z_j)^{k-1}} + \dots$$

Here after the division is performed there may be an entire part
(that is an entire rational function, a polynomial). If now we add
together all the fractions thus obtained we shall receive final formula
(37), and all the entire parts must mutually cancel because
otherwise a proper fraction would appear in the form of a sum of
a proper fraction and a polynomial, which is impossible. (Why is
it impossible?)

Practically, the decomposition is usually carried out by means
of the method of undetermined coefficients. For this purpose we
write the right-hand side of formula (37) with literal coefficients
in the numerators and then find the coefficients. There are different
techniques for calculating the coefficients. We shall demonstrate
them by considering an example.
Let it be required to decompose the fraction

$$\frac{x^3 - 2x + 3}{x (x - 1)(x + 2)^2} \tag{38}$$

into partial fractions. Since the fraction is proper we write, by
formula (37),

$$\frac{x^3 - 2x + 3}{x (x - 1)(x + 2)^2} = \frac{A}{x} + \frac{B}{x - 1} + \frac{C}{(x + 2)^2} + \frac{D}{x + 2} \tag{39}$$

where the coefficients, for simplicity’s sake, are all denoted by different
letters. Multiplying by the common denominator we receive

$$x^3 - 2x + 3 = A (x - 1)(x + 2)^2 + Bx (x + 2)^2 + Cx (x - 1) + Dx (x - 1)(x + 2) \tag{40}$$

The equality must be an identity. We can therefore remove the
parentheses and equate the coefficients of the same degrees of x.
This yields the system of equations (verify it!) of the form

$$\begin{aligned}
A + B + D &= 1 \\
3A + 4B + C + D &= 0 \\
4B - C - 2D &= -2 \\
-4A &= 3
\end{aligned} \tag{41}$$

from which it is easy to find

$$A = -\frac{3}{4}, \quad B = \frac{2}{9}, \quad C = -\frac{1}{6}, \quad D = \frac{55}{36} \tag{42}$$

This method is the most reliable [equating the coefficients we
are sure that relation (40) is an identity and therefore (39) is also
an identity] but not the simplest. There is another method which
we can illustrate by taking (39) as an example. Without removing
the parentheses in equality (40) we simply make x assume four
different values according to the number of unknown coefficients
to obtain equations for determining A, B, C and D. In our example
it is very convenient to put x = 0, 1 and −2 since these values
eliminate some summands and, in addition, to equate x to any
arbitrary value, for example, to −1. This results in the relations

$$-4A = 3, \quad 9B = 2, \quad 6C = -1, \quad -2A - B + 2C + 2D = 4 \tag{43}$$

which imply the same values (42).

The last method is the collocation method, that is the method of
equating two expressions for different values of the argument. Such
a method is especially effective in case the denominator of the fraction
that should be decomposed has no multiple roots, that is all the
linear factors entering in its factorization have the first power.
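The collocation computation for this example can be repeated in exact arithmetic. The Python sketch below assumes that the numerator of the fraction is x³ − 2x + 3 and that the identity to be satisfied is (40); the function names are our own. The values x = 0, 1, −2 each leave one unknown, and x = −1 supplies the fourth equation.

```python
from fractions import Fraction as F


def lhs(x):
    """Left-hand side of identity (40)."""
    return x ** 3 - 2 * x + 3


# x = 0, 1, -2: only one summand of the right-hand side of (40) survives
A = F(lhs(0), (0 - 1) * (0 + 2) ** 2)    # 3 = -4A
B = F(lhs(1), 1 * (1 + 2) ** 2)          # 2 = 9B
C = F(lhs(-2), (-2) * (-2 - 1))          # -1 = 6C
# x = -1 gives -2A - B + 2C + 2D = 4
D = (lhs(-1) + 2 * A + B - 2 * C) / 2


def rhs(x):
    """Right-hand side of identity (40) with the coefficients found above."""
    return (A * (x - 1) * (x + 2) ** 2 + B * x * (x + 2) ** 2
            + C * x * (x - 1) + D * x * (x - 1) * (x + 2))


print(A, B, C, D)
# Two cubic polynomials coinciding at more than three points are identical
print(all(lhs(x) == rhs(x) for x in range(-5, 6)))
```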

If fraction (36) has real coefficients and if the independent vari- 
able is considered real the denominator can nevertheless have ima- 
ginary roots. Then though decomposition (37) is possible it may 
sometimes be inconvenient. In such cases another decomposition 
is often used. Namely, departing from decomposition (37) and using 
formula (29) we can prove the validity of the following decompo- 
sition: 


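$$\begin{aligned}
\frac{Q(x)}{P(x)} &= \frac{Q(x)}{a_0 (x - x_1)^{\alpha_1} \dots (x - x_r)^{\alpha_r} (x^2 + p_1 x + q_1)^{\beta_1} \dots (x^2 + p_s x + q_s)^{\beta_s}} = \\
&= \frac{A_1}{(x - x_1)^{\alpha_1}} + \dots + \frac{A_{\alpha_1}}{x - x_1} + \dots + {} \\
&\quad + \frac{M_1 x + N_1}{(x^2 + p_1 x + q_1)^{\beta_1}} + \dots + \frac{M_{\beta_1} x + N_{\beta_1}}{x^2 + p_1 x + q_1} + \dots + {} \\
&\quad + \frac{R_1 x + S_1}{(x^2 + p_s x + q_s)^{\beta_s}} + \dots + \frac{R_{\beta_s} x + S_{\beta_s}}{x^2 + p_s x + q_s}
\end{aligned} \tag{44}$$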

The fractions on the right-hand side which have denominators
equal to powers of quadratic trinomials are called the partial rational 
fractions of the second type. As before, all the coefficients entering 
into the numerators can be found by the method of undetermined 
coefficients. But since all the operations are performed now only 
on real numbers the unknown coefficients which should be found 
from a system of equations of the first degree [obtained by analogy 
with system (41) or (43)] must necessarily be real. 

Thus, every proper rational fraction with real coefficients can be 
represented in the form of a sum of partial rational fractions of the 
first and of the second types with real coefficients. It may happen 
that there will be only fractions of the first type in case all the roots 
of the denominator are real or only fractions of the second type if 
all the roots are imaginary. 

We shall not give here the proof of the possibility of decomposi- 
tion (44). In every concrete problem the validity of (44) is confirmed 
by the results of the calculations.

11. Some General Remarks on Functions of a Complex Variable. 
Each of the functions


w=2—iz, w= rrr + W= etc. (45) 
assumes complex values for complex values of z. In the general
form a relationship of this type can be written as w = f (z) where
both z and w are complex. Many notions of the theory of real variables
can be transferred to the theory of complex functions of complex
variables without any essential changes. This refers to the
notion of a limit and to the properties of limits (of course, with the
necessary exception of those properties that are connected with
inequalities), to the notion of the continuity and to that of points of
discontinuity and so on. The determination of the points of dis- 
continuity of a function is carried out in the same way as for the 
functions of real variables in Sec. III.13. For instance, the first and 
the third of functions (45) are continuous for all z while the second 
one has two points of discontinuity z = +i at which it approaches 
infinity. 

The definition of the derivative of a function of a complex vari- 
able is analogous to the definition of the derivative of a real func- 
tion (see Sec. IV.2): 

$$\frac{dw}{dz} = f'(z) = \lim_{\Delta z \to 0} \frac{\Delta w}{\Delta z} = \lim_{\Delta z \to 0} \frac{f(z + \Delta z) - f(z)}{\Delta z}$$

where the limit must be uniquely determined and independent of
the process of approaching zero as Δz → 0 which can be quite arbitrary.
One can easily verify that all the properties of the derivative
and all the differentiation formulas established in Secs. IV.4-5 remain
true without changes but we are not going to treat these questions
in detail here. The notion of derivatives of higher orders and the
Taylor formula and series (see Sec. IV.5) that are based on this
notion also remain true. We shall again discuss the question in
Sec. XVII.14.

Fig. 188
The dotted lines represent the mapping: $w_1 = f(z_1)$ etc.

The geometric interpretation of a function of a complex variable
w = f (z) involves some essentially new ideas. Since the values
of z are represented by points in a complex plane of the argument z
and the corresponding values of w are represented by points in
the complex plane of the variable w we see that in this case there is
a certain correspondence between the points of the z-plane and the
points of the w-plane. We can also say that the points of the z-plane
are mapped on the points of the w-plane. Or, in other words, the
z-plane or its part where the function w = f (z) is defined is mapped
into the w-plane. For example, the function w = z³ − iz performs
a mapping under which the points z = 0, z = 1, z = i and
z = 2 − i are mapped on the points w = 0, w = 1 − i, w = 1 − i
and w = 1 − 13i, respectively, and so on. (Check it up!) If the
point z moves in the z-plane and traces a curve then w also describes 
a certain curve in the w-plane (see Fig. 188). Thus, curves are trans- 
formed into curves under the mapping w = f (z) and geometric figures 
in the z-plane are mapped on the geometric figures in the w-plane 
although the form of a geometric figure may be considerably changed 
by the mapping. 
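The values of the mapping w = z³ − iz quoted above can be verified directly with the help of complex arithmetic. The following Python lines are only a check; the names `f`, `points` and `images` are our own.

```python
def f(z):
    """The mapping w = z^3 - i z."""
    return z ** 3 - 1j * z


points = [0, 1, 1j, 2 - 1j]
images = [f(z) for z in points]
for z, w in zip(points, images):
    print(z, "->", w)
```

Note that the two different points z = 1 and z = i have the same image, so that the mapping is not one-to-one.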


CHAPTER IX 




Functions 
of Several Variables 


§ 1. Functions of Two Variables 


1. Methods of Representing. The concept of a function of any num- 
ber of independent variables and the corresponding notation were 
introduced in Sec. 1.44 and Sec. 1.12. We have already used the 
concept but it is necessary to discuss the ways of representing such 
functions in greater detail. 

The method of analytical representation of a function z = f (x, y) 
depending on two variables does not essentially differ from the one 
applied to functions of one variable whereas the tabular method 
becomes much more complicated in this case because now we have 
to represent the values of two independent variables and it is there- 
fore necessary to use a table with two entries. Such a table may have 
the following form: 

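TWO-ENTRY TABLE

z = f (x, y)

| x \ y | y_1                 | y_2                 | y_3                 | ... | y_N                 |
| x_1   | z_11 = f (x_1, y_1) | z_12 = f (x_1, y_2) | z_13 = f (x_1, y_3) | ... | z_1N = f (x_1, y_N) |
| x_2   | z_21 = f (x_2, y_1) | z_22 = f (x_2, y_2) | z_23 = f (x_2, y_3) | ... | z_2N = f (x_2, y_N) |
| ...   | ...                 | ...                 | ...                 | ... | ...                 |
| x_M   | z_M1 = f (x_M, y_1) | z_M2 = f (x_M, y_2) | z_M3 = f (x_M, y_3) | ... | z_MN = f (x_M, y_N) |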


Here we have to denote the values of the function by two indices.
Obviously, it is difficult to compile such a table if we have a great
number of values for x and y.
In compiling a table one can also take into account that if we
fix a certain value of one of the variables the dependent variable z
becomes a function of only one variable. We can therefore obtain
a system of one-entry tables but this is, of course, equivalent to
a two-entry table. For instance, such a system may have the following
form:

ONE-ENTRY TABLE

x = x_1

| y | y_1                 | y_2                 | ... | y_N                 |
| z | z_11 = f (x_1, y_1) | z_12 = f (x_1, y_2) | ... | z_1N = f (x_1, y_N) |

x = x_2

| y | y_1                 | y_2                 | ... | y_N                 |
| z | z_21 = f (x_2, y_1) | z_22 = f (x_2, y_2) | ... | z_2N = f (x_2, y_N) |


The same principle of fixing certain values of one of the variables 
can be used for the graphical representation of a function of two 
variables. This results in a system of graphs. Such a system may 
have the form shown in Fig. 189. 

In theoretical investigations we also encounter one more method 
of graphical representation of a function z = f (x, y). Let us take 
Cartesian coordinates x, y and z in space (we can also use other
coordinate systems which will be introduced in Sec. X.1). Making
the independent variables assume certain numerical values $x = x_1$
and $y = y_1$ we obtain the point $N_1$ (see Fig. 190) lying in the plane
of the arguments x and y (that is the x, y-plane which is denoted as
xOy). After calculating the corresponding value $z_1 = f(x_1, y_1)$ of
the function we can construct the point $M_1$ in space. Taking some
other values of the independent variables we construct the point $M_2$
etc. If now we regard (theoretically) the independent variables as
taking on all their possible values the points of type N cover either
the whole plane or some part of the plane. Each of the points N
generates a corresponding point M lying above or under N depending
on the sign of the value of the function. Therefore all the points M
cover a surface (S) which is nothing but the “graph” of the function
in question.
We shall use the above method for theoretical purposes to visualize
the character of the behaviour of the function but the practical
significance of the method is limited by the difficulty of drawing
a surface in space.

Fig. 189    Fig. 190

Now let us consider a method which is widely used in practical
problems. Making z take on some constant values $h_1, h_2, h_3, \dots$
we obtain the equations $f(x, y) = h_1$, $f(x, y) = h_2, \dots$ which
describe the corresponding curves lying in the x, y-plane. These
curves are called the level lines of the function f. We can obtain
the lines geometrically (see Fig. 191) by taking the curves of intersection
of the surface z = f (x, y) with the planes $z = h_1$, $z = h_2, \dots$
(which are parallel to the x, y-plane) and projecting the curves on
the plane xOy. In particular, this method is widely used in
drawing geographical maps. In this case the function represents
the elevation above sea level. For instance, the system of level
lines may have the form represented in Fig. 192. The bergstrichs
indicate here the directions in which the function decreases (in the
case of a geographical map they show the direction of water run-off).
In Fig. 192 we see that the graph has “peaks” at the points A and B
(the peak at the point A is higher than at the point B) and a “valley”
at the point C etc.

Fig. 191    Fig. 192

There is a special branch of mathematics called nomography
(the name is originated from the Greek words “νόμος” law and
“γράφειν” to write) which deals with methods of constructing nomograms,
that is special drawings that are helpful for representing
functions of any number of independent variables in a practically
convenient way. The application of nomograms saves much time
and effort in calculations and does not require any special qualification.
It should therefore be widely recommended. There are
many different types of nomograms. As an example we represent
a nomogram in Fig. 193 which is designed for calculating
one of the setting angles $\alpha_1$ of a cutter in a cutter grinder for
given tool angles α and φ. The values of α, φ and $\alpha_1$ are marked
on the three corresponding axes (two of them are curvilinear). If
we apply a ruler to certain points α and φ on the corresponding
axes we can read the desired value of $\alpha_1$ on the third axis.
For instance, in Fig. 193 we have the values α = 10° and φ = 30°
which yield $\alpha_1$ = 19.5°.

Fig. 193

2. Domain of Definition. The domain of definition of a function 
z = f (x, y) is the range of the independent variables x and y. If
the independent variables are continuous the domain is either the
whole x, y-plane or a region in the plane, or, finally, a totality
of a number of regions in the plane. When we speak about a region
in the x, y-plane we usually mean a connected (simply-connected)
set of points, that is a set consisting of one entire part of the plane,
which does not degenerate (this means that it is not a line or a point
or a number of separate points). We sometimes distinguish between 
closed regions (domains) which include their boundaries and open 
ones which do not include the boundaries. In other words, such 
regions in a plane play the same role as intervals on a straight line 
(see Sec. I.5). 

For example, the domain of definition of the function z = z + y 


is the whole x, y-plane. The domain of the function z = Va 


J 


Fig. 194 


(in case we consider only real values of z) is defined by the inequality


y — 2 > 0; eqyee as The domain of the function z = Vinny 


is obtained from the inequality x² + y² < 1 etc. These domains


are shown in Fig. 194.
3. Linear Function. According to Sec. I.17, a linear function
of two variables has the form

$$z = ax + by + c \tag{1}$$

where a, b and c are constant coefficients. By analogy with Sec. I.22
we can easily derive a formula for an increment of the function:

$$\Delta z = a\,\Delta x + b\,\Delta y$$


Similar formulas hold for linear functions of any number of vari- 
ables. 

Formula (1) having three coefficients, any linear approximation
(i.e. an approximate replacement of a function by a linear function)
requires three conditions. For instance, let the values of a function
f (x, y) be known:

$$f(x_1, y_1) = z_1, \quad f(x_2, y_2) = z_2, \quad f(x_3, y_3) = z_3$$

If we want to construct a linear function (1) which takes on the
same values (i.e. to carry out the linear interpolation) by substituting
(approximately) an expression of type (1) for f we must have

$$\begin{aligned}
a x_1 + b y_1 + c &= z_1 \\
a x_2 + b y_2 + c &= z_2 \\
a x_3 + b y_3 + c &= z_3
\end{aligned} \tag{2}$$

We determine the coefficients by solving this system. Such a replacement
of f by (1) yields good results if, in the first place, we
consider the values of the arguments lying inside the triangle with
the vertices $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$ (see Fig. 195) and, in the
second place, if the triangle is not large enough for non-linear
properties of the function f to be manifested in a noticeable manner.
Besides, the triangle must not have very acute angles. (If in a certain
limiting process one of the angles vanishes and the triangle turns into
a line segment the determinant of system (2) also vanishes and our
calculations become inapplicable.)

If we replace f by (1) outside the triangle (this is the linear
extrapolation) then, in general, the error will increase as we
move away from the triangle.

We can likewise carry out linear interpolations for functions of any
number of arguments.

Fig. 195
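System (2) is small enough to be solved by determinants. The Python sketch below is an illustration only: the function names, the use of Cramer's rule and the three sample points (taken from the linear function z = 2x − 3y + 5) are our own assumptions. When the three points lie on one straight line the determinant of the system vanishes and the function reports an error.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def linear_through(p1, p2, p3):
    """Coefficients (a, b, c) of z = a x + b y + c passing through three
    points given as (x, y, z); system (2) is solved by Cramer's rule."""
    rows = [(x, y, 1.0) for x, y, _ in (p1, p2, p3)]
    rhs = [p[2] for p in (p1, p2, p3)]
    d = det3(rows)
    if d == 0:
        raise ValueError("the three points lie on one straight line")
    coeffs = []
    for col in range(3):
        # replace one column of the matrix by the right-hand sides
        m = [tuple(rhs[r] if k == col else rows[r][k] for k in range(3))
             for r in range(3)]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)


# Hypothetical data taken from z = 2x - 3y + 5
a, b, c = linear_through((0, 0, 5), (1, 0, 7), (0, 1, 2))
print(a, b, c)
```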

4. Continuity and Discontinuity. The concept of continuity of 
a function z = f (x, y) is quite similar to that of a function of one 
argument which was discussed in Sec. 1.146 and Sec. III.12. As an 
example we can formulate the following definition of continuity: 
a function f is called a continuous function for the values $x = x_0$
and $y = y_0$ of the arguments if for every process in which $x \to x_0$
and $y \to y_0$ (in an arbitrary way) we have $f(x, y) \to f(x_0, y_0)$.
If otherwise the function is said to be discontinuous for these values
of the arguments. Then the point with the coordinates $(x_0, y_0)$
lying in the x, y-plane is called the point of discontinuity of the 
function. A function which is continuous at each point of a region 
is said to be continuous in the region. 

It should be noted that besides separate points of discontinuity 
a function of two variables may have entire lines of discontinuity, 
that is lines wholly consisting of points of discontinuity. For instance, 
let us take the functions

$$z = \frac{1}{x^2 + y^2} \quad \text{and} \quad z = \frac{1}{(y - x)^2}$$

The first function has only one point of discontinuity (0, 0) whereas
the second function has an entire line of discontinuity, namely,
the straight line (y − x)² = 0, i.e. y = x. The level lines of the
functions are depicted in Fig. 196. In both cases the functions
approach infinity at the points of discontinuity. But, as in the case
of a function of one variable, there are other types of discontinuities.
In practical problems we often encounter such a line of discontinuity
of a function that in approaching any point of this line from one
side the function has a certain finite limit whereas in approaching
the same point from the other side of the line the function has a
different finite limit. In such a case the function has a finite jump
as the point (x, y) passes through the line. An approximate sketch
of the graph of such a function is depicted in Fig. 197.

Fig. 196 (labels in the figure: z = ∞ at this point; z = ∞ on this line)

The behaviour of a function of two variables in the vicinity of its 
point of discontinuity may essentially depend on the way the point 
is approached. For instance, there may exist a limit depending 
on the choice of a path of approaching the point for one group of 
paths and there may exist neither a finite nor an infinite limit for 
other paths etc. Since there is an infinitude of ways of approaching 
a point of discontinuity (whereas we have only two main ways of 
approaching a point of discontinuity in the case of a function of 
one argument, namely, from the right or from the left side) points 
of discontinuity of functions of several variables are, in general,
of a more complicated type than those of functions of one independent
variable. For example, the function z = a has its only point 
of discontinuity at the point x = 0, y = 0 where the denominator 
vanishes. Now if z — 0 and y— 0 in such a way that + =k, i.e. 


F . (ARE Be 
y = ka, where k is a constant we obtain z = apes i 


19-0141 
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and therefore the limit depends on the relation between y and z 
(see Fig. 198). If in approaching a point there is no single finite 
or infinite limit of a function then calculating the limit at the point 
we must specify the way of approaching it to avoid misunderstandings. 
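The dependence of the limit on the path can be checked numerically. The sketch below assumes the standard example z = 2xy/(x² + y²) and evaluates it at points of the line y = kx very close to the origin; the helper name `limit_along_line` and the chosen step are illustrative assumptions.

```python
def z(x, y):
    """z = 2xy / (x^2 + y^2); undefined at the origin."""
    return 2 * x * y / (x * x + y * y)


def limit_along_line(k, x=1e-9):
    """Value of z at a point of the line y = k*x close to the origin."""
    return z(x, k * x)


# Compare the value near the origin with 2k / (1 + k^2) for several slopes.
for k in (1.0, 0.0, -1.0, 2.0):
    print(k, limit_along_line(k), 2 * k / (1 + k * k))
```

Along each straight line the value does not depend on how close the point is to the origin, only on the slope k, which is exactly why no single limit exists at (0, 0).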
The properties of continuous functions of two arguments defined
in a finite closed region in the x, y-plane are analogous to those
described in Sec. III.14 for functions of one argument
defined over a closed finite interval. We are therefore
not going to enumerate the properties again.

Fig. 197. (L) is the line of discontinuity of the function f; it lies in the plane xOy.

Fig. 198. In approaching the origin in the direction a the limit of z is equal
to 1; the limit is equal to zero in the direction b and it is equal
to −1 in the direction c. There is no limit in approaching the origin
along the spiral d.

We sometimes encounter the problem of solving an inequality
of the form f (x, y) > 0. This can be carried out in a way similar
to the method of solving inequalities discussed in Sec. III.15 [see
inequality (III.17)]. We should first draw the curve f (x, y) = 0
and the lines of discontinuity of the function f provided there are
such lines. All these curves break the plane into parts, and the
function retains its sign inside each of the parts. We can determine
the corresponding signs by calculating the values of the function
at points arbitrarily chosen in each of the parts.

For instance, let us solve the inequality

$$\frac{x^2 + y^2 - 4}{x + y} > 0 \qquad (3)$$

Here the circle x² + y² − 4 = 0 is the line of zeros and the
straight line x + y = 0 serves as the line of discontinuity. They
divide the plane into four parts shown in Fig. 199. Now we take
a point in each of the parts, for example, the points (−3, 0), (−1, 0),
(1, 0) and (3, 0), and determine the corresponding signs of the
function which are −, +, − and +. The regions where inequality
(3) holds are shaded in Fig. 199.
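The sign test described above is easy to carry out mechanically. The following sketch evaluates the left-hand side of inequality (3) at one test point per region; the names `f`, `sign` and `test_points` are illustrative.

```python
def f(x, y):
    """Left-hand side of inequality (3)."""
    return (x * x + y * y - 4) / (x + y)


def sign(value):
    return "+" if value > 0 else "-"


# One test point in each of the four parts of the plane.
test_points = [(-3, 0), (-1, 0), (1, 0), (3, 0)]
signs = [sign(f(x, y)) for x, y in test_points]
print(signs)
```

The inequality holds in the two parts where the sign is "+", that is inside the circle below the line x + y = 0 and outside the circle above it.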


5. Implicit Functions. The definition of an implicit function of two
arguments is similar to that in Sec. I.20 given for a function of
one independent variable. An implicit function z (x, y) defined by an equation

$$F(x, y, z) = 0 \qquad (4)$$

may turn out to be multiple-valued and then we can consider its
single-valued branches. Equation (4) may define a surface of an
arbitrary form whereas a surface determined by an equation
z = f (x, y) is punctured by any straight line parallel to the z-axis
at not more than one point (see Fig. 190).

Fig. 199


§ 2. Functions of Arbitrary Number of Variables 


6. Methods of Representing. The main notions related to the 
analytical form of a function and to its properties are transferred 
to functions of any number of arguments. But there are some addi- 
tional difficulties in investigating such functions. First of all, the 
tabular and the graphical ways of their representation become too 
complicated: We can, of course, represent a function of three variab- 
les by means of a system of two-entry tables (see Sec. 1) or by a set 
of pictures similar to Figs. 189 or 192, but this is very difficult. 

But in certain cases the calculation of the values of a function 
of a large number of arguments may sometimes be reduced to the 
calculation of the values of several functions of a lower number of 
variables. Then we can widely use the methods described in Sec. 1.43 
and Sec. 1. For instance, take a function of four arguments of the
form u = f (x, y) + φ (z, t). To calculate the values of u we need
tables and graphs of the functions f and φ. But each of the last two
functions depends only upon two variables and this facilitates the
calculations. Similarly, the calculation of the values of the function
u = f [φ (x) + y, ψ (z) − t] depending on four independent variables
requires the representation of one function of two variables
and of two functions of one variable and so on. In such cases the
calculation and the investigation of functions become much easier.
Unfortunately, not all functions can be represented in this way.


7. Functions of Three Arguments. Another difficulty lies in the
geometrical interpretation of a functional relationship in case the
total number of variables (both dependent and independent) is
greater than 3, that is greater than the dimension of our usual
geometrical space. The situation is comparatively simpler for a function
of three arguments u = f (x, y, z) (the total number of variables
equals 4 here). In this case the domain of definition is either the whole
space of the variables x, y and z or some part of it, that is one or
several regions (domains) in the x, y, z-space (see Sec. 2; the notion
of a degenerated region should be appropriately changed in this
case). We can therefore represent such a domain geometrically.
For instance, the domain of definition of the function u = x² + y² − z
is the whole space whereas the function $u = \sqrt{1 - x^2 - y^2 - z^2}$
is defined only if

$$1 - x^2 - y^2 - z^2 \ge 0 \quad\text{or}\quad x^2 + y^2 + z^2 \le 1$$

i.e. in the last case the domain is the sphere of radius 1 with centre
at the origin of coordinates.

We can also consider level surfaces of a function f (x, y, z)
understanding them in the sense of Sec. 1, that is as surfaces in the
x, y, z-space on which the function is constant: f (x, y, z) = const.

Points of discontinuity (provided a function has them) lie in
the x, y, z-space and can also be represented in a visual way. The
points of discontinuity of a function of three independent variables 
can be located separately but they can also form lines of disconti- 
nuity and even surfaces of discontinuity, that is surfaces which enti- 
rely consist of points of discontinuity. For example, when investi- 
gating a passage from one physical medium into another we inter- 
pret the interface as a surface on which the quantities characterizing 
the properties of the media have discontinuities (for instance, this is 
applicable to investigating the passage of the light from water into 
air or from glass into air etc.). 

8. General Case. The concept of a space of variables is quite
visual and it is therefore convenient to transfer it to the case of
functions of any number of independent variables. This leads to
the notion of a many-dimensional Cartesian space (see Sec. VII.18,
example 3). For instance, let a function w = f (x, y, z, u) of four
arguments be considered. Then every quadruple of values x, y, z
and u determines a point in the space $E_4$ (strictly speaking, such
a quadruple is, by definition, a point of $E_4$). Thus the space $E_4$
serves as the space of arguments here; the domain of definition of
the function w is a region (domain) in the space $E_4$ or a set of a
number of such regions. The function may be continuous or may have
separate points of discontinuity, lines of discontinuity,
two-dimensional surfaces or three-dimensional hypersurfaces (see Sec. VII.19)
consisting of points of discontinuity.

To construct the "graph" of a function u = f (x, y, z, t) we need
the five-dimensional space $E_5$ of the variables x, y, z, t and u. To
find the points of the graph we must make x, y, z and t assume
arbitrary values and calculate the corresponding values of u. [For
instance, we can easily verify that the graph of the function
u = xz − 2y²t passes through the points (1, 1, 2, 0, 2) and (−1, 2,
0, −2, 16) etc.] When we discussed functions of three variables
in Sec. 7 we saw that the space of the arguments had a visual
geometrical interpretation. But this is not so for its graph that lies
in the four-dimensional space $E_4$ which we cannot visualize.

We shall see in Sec. X.2 that a many-dimensional space can be 
interpreted in a less formal way than that connected with the notion 
of a Cartesian space of n-tuples consisting of n numbers. 

9. Concept of Field. We say that there is a field of a quantity
in space if a certain value of the quantity is defined at each point
of the space. For instance, when investigating a flow of gas we
consider the field of temperature (the temperature has a certain value
at each point), the field of density, the field of velocities and so on.
A field can be a scalar field or a vector field depending on the
properties of the quantity in question. For example, a temperature
field and a density field are scalar ones whereas a velocity field or 
a field of force is a vector field. A field is called stationary (steady- 
state) if it does not change at each point of the space as time passes 
and it is called non-stationary if such a change takes place. 

For definiteness, let us denote a scalar quantity by the letter u 
and an arbitrary (variable) point in space by the letter M. Then 
to each position of the point M there corresponds a certain value 
of the quantity u and we can therefore regard u as a function of M: 
u = f (M). Such a function differs from those considered above 
since a point is not a quantity. But in the general sense of the notion 
of a function (widely used in modern mathematics) we can apply 
the term “function” to every situation when there exists a certain 
law according to which to the objects of one “kind” (of an arbitrary 
nature) there correspond the objects of some other “kind” (in our 
case the objects of the first kind are points in space and the objects 
of the second kind which correspond to the points are the values 
of the quantity u). In case a field is non-stationary we have
u = f (M, t) where t is the time.

Now we can easily pass from a function of a point to a function
of three variables, namely, of three spatial coordinates. To do this
it is sufficient to introduce a Cartesian coordinate system x, y, z
in space. Then the position of a point M in space is completely
characterized by the corresponding values x, y and z, that is we
can write u = u (x, y, z). Conversely, if a coordinate system x, y, z
is given then any function of x, y and z can be regarded as a function
of a point. But we should take into account that a field u = f (M)
(i.e. a function u of the point M) has its own meaning and can be
investigated without introducing any coordinate system. Besides,
introducing coordinate systems in different ways we obtain different
relationships of the form u = u (x, y, z) for one and the same
function u = f (M). Hence, when investigating a field we regard
the concept of a function of a point as a primary concept relative
to the concept of a function of the coordinates of a point.

If a quantity, according to its physical or geometrical meaning, 
depends on the position of a point in a plane we call the corres- 
ponding field a plane field. We encounter such fields when investi- 
gating thermal processes in a thin plate whose thickness can be 
neglected. 

If we have a space field of a quantity u which depends only on x
and y and does not depend on z in a certain coordinate system x, y, z
the field is said to be a plane-parallel field. Then we can regard such
a field as being defined in the x, y-plane (and as independent of z)
which means that the field can be treated as a plane field. But of
course we should keep in mind that in reality the field is spatial
and that the relationship between u and the coordinates is the same
in all the planes parallel to the x, y-plane.


§ 3. Partial Derivatives and Differentials
of the First Order 


10. Basic Definitions. Let a function of several independent
variables be given. For definiteness, let it be a function of three
arguments u = f (x, y, z). If we fix certain values of all the arguments
but one, the variable u becomes a function of this single argument.
We can therefore differentiate the function with respect to
the argument and take its differential as it was done in Chapter IV
for a function of one independent variable. Such derivatives are
called partial derivatives. The corresponding differentials are partial
differentials of the function. In other words,

$$u'_x = f'_x(x, y, z) = \lim_{\Delta x \to 0} \frac{\Delta_x u}{\Delta x}, \qquad
u'_y = f'_y(x, y, z) = \lim_{\Delta y \to 0} \frac{\Delta_y u}{\Delta y}$$

where $\Delta_x u = f(x + \Delta x, y, z) - f(x, y, z)$ and
$\Delta_y u = f(x, y + \Delta y, z) - f(x, y, z)$ are the partial increments of the function.
A partial increment corresponds to a change of one of the variables
when all the other variables are kept constant. Let the reader write
the expression of the increment $\Delta_z u$ and of the derivative $u'_z$. One
should take into account that the symbol u' or f' (x, y, z) has no
sense for a function of several variables because one must necessarily
indicate the variable with respect to which the derivative is taken.

The computation of partial derivatives of concrete elementary 
functions is performed according to the rules given in Sec. IV.5. When 
we differentiate with respect to a certain variable we must regard 
all the other variables as constants. 

Example. Let $u = x^2 z^3 - y^z$; then $u'_x = 2xz^3$, $u'_y = -z y^{z-1}$ and
$u'_z = 3x^2 z^2 - y^z \ln y$ (check up these results!).
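The results of the example can be checked numerically by varying one argument at a time while the others are kept fixed. The sketch below takes u = x²z³ − y^z and compares the analytic partial derivatives with central difference quotients; the helper `partial` and the test point are illustrative assumptions.

```python
import math


def u(x, y, z):
    return x**2 * z**3 - y**z


def partial(f, point, i, h=1e-6):
    """Central difference quotient with respect to the i-th argument."""
    plus, minus = list(point), list(point)
    plus[i] += h
    minus[i] -= h
    return (f(*plus) - f(*minus)) / (2 * h)


def analytic(x, y, z):
    """Partial derivatives u'_x, u'_y, u'_z found by the rules of differentiation."""
    return (2 * x * z**3,
            -z * y**(z - 1),
            3 * x**2 * z**2 - y**z * math.log(y))


point = (1.5, 2.0, 0.7)
numeric = tuple(partial(u, point, i) for i in range(3))
print(numeric)
print(analytic(*point))
```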

A partial differential is denoted by the symbol ∂ with a subscript
indicating the argument with respect to which the differentiation
is performed. Thus,

$$\partial_x u = u'_x\,\Delta x, \qquad \partial_y u = u'_y\,\Delta y \qquad\text{and}\qquad \partial_z u = u'_z\,\Delta z$$

In particular, if we put u = x here we obtain

$$\partial_x x = \Delta x \qquad\text{and}\qquad \partial_y x = \partial_z x = 0 \qquad (5)$$

since $x'_x = 1$ and $x'_y = x'_z = 0$. We have therefore $\partial_x u = u'_x\,\partial_x x$
which implies $u'_x = \frac{\partial_x u}{\partial_x x}$. One usually omits the subscripts in
putting down the last formula and simply writes $u'_x = \frac{\partial u}{\partial x}$ because
the denominator itself indicates that the derivative is taken with
respect to x (or with respect to some other variable in other cases).
Similarly, $u'_y = \frac{\partial u}{\partial y}$ and $u'_z = \frac{\partial u}{\partial z}$. This rule of writing differentials
sometimes becomes inconvenient. Indeed, in the first place, not
only the denominators differ in the last two expressions but the
numerators as well. Actually, the numerator of the first fraction must be
regarded as $\partial_y u$ whereas the numerator of the second fraction as $\partial_z u$.
In the second place, for example, we cannot write $\frac{\partial x}{\partial u}$ instead of
$\left(\frac{\partial u}{\partial x}\right)^{-1}$ since the differentials in these expressions have a different
sense (in the first expression the differentials are taken with respect
to u whereas in the second one with respect to x).

When using partial derivatives one must be careful and pay
much attention to the choice of independent variables. For instance,
if we write the expression for the power of an electric current in
the form $P = \frac{U^2}{R}$ where U is the voltage and R is the resistance of
the circuit we obtain $\frac{\partial P}{\partial R} = -\frac{U^2}{R^2} = -\frac{P}{R}$. But if the same formula
is written in the form $P = I^2 R$ where I is the electric
current then we get $\frac{\partial P}{\partial R} = I^2 = \frac{P}{R}$. There is no contradiction
between these results. Actually, if we write them in full we shall have

$$\left.\frac{\partial P}{\partial R}\right|_{U = \text{const}} = -\frac{P}{R} \quad\text{for the first result and}\quad
\left.\frac{\partial P}{\partial R}\right|_{I = \text{const}} = \frac{P}{R} \quad\text{for the second one.}$$

(Let the reader think about the physical meaning of
the signs + and − entering into the last formulas.)
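The two results can be reproduced numerically: the derivative of P with respect to R is taken once with the voltage held fixed and once with the current held fixed. In the sketch below the function names and the sample values U = 12, R = 4 are illustrative assumptions.

```python
def dP_dR_const_U(U, R, h=1e-6):
    """dP/dR for P = U^2 / R with the voltage U held constant."""
    P = lambda r: U**2 / r
    return (P(R + h) - P(R - h)) / (2 * h)


def dP_dR_const_I(I, R, h=1e-6):
    """dP/dR for P = I^2 * R with the current I held constant."""
    P = lambda r: I**2 * r
    return (P(R + h) - P(R - h)) / (2 * h)


U, R = 12.0, 4.0
I = U / R          # the same state of the circuit, described by the current
P = U**2 / R       # equals I**2 * R

print(dP_dR_const_U(U, R), -P / R)
print(dP_dR_const_I(I, R), P / R)
```

Both derivatives refer to the same state of the circuit; they differ only in which quantity is kept constant while R varies.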

11. Total Differential. The total (or exact) differential of a function
u = f (x, y, z) is equal to the sum of all its partial differentials:

$$du = \partial_x u + \partial_y u + \partial_z u = u'_x\,\Delta x + u'_y\,\Delta y + u'_z\,\Delta z
= \frac{\partial u}{\partial x}\,\Delta x + \frac{\partial u}{\partial y}\,\Delta y + \frac{\partial u}{\partial z}\,\Delta z \qquad (6)$$

Formula (6) must be regarded as the definition of a total differential.

In particular, if we put u = x formula (5) implies that

$$dx = \partial_x x + \partial_y x + \partial_z x = \Delta x$$

and hence the total differential of an independent variable is equal
to the increment of the variable (compare with Sec. IV.8). Formula (6)
can therefore be rewritten as

$$du = u'_x\,dx + u'_y\,dy + u'_z\,dz
= \frac{\partial u}{\partial x}\,dx + \frac{\partial u}{\partial y}\,dy + \frac{\partial u}{\partial z}\,dz \qquad (7)$$


For example,

$$d\left(x^2 \sin\frac{x}{y}\right) = \left(2x \sin\frac{x}{y} + \frac{x^2}{y}\cos\frac{x}{y}\right) dx - \frac{x^3}{y^2}\cos\frac{x}{y}\,dy$$

The connection between the total differential of a function and
the total increment of the function is analogous to the one described
in Sec. IV.8. Let the independent variables receive increments Δx,
Δy and Δz. Then u receives the increment

$$\Delta u = f(x + \Delta x, y + \Delta y, z + \Delta z) - f(x, y, z)$$

This is the total increment. It can be represented in the form of a
sum of three increments so that each increment corresponds to the
change of one of the variables. Namely,

$$\Delta u = [f(x + \Delta x, y, z) - f(x, y, z)]
+ [f(x + \Delta x, y + \Delta y, z) - f(x + \Delta x, y, z)]
+ [f(x + \Delta x, y + \Delta y, z + \Delta z) - f(x + \Delta x, y + \Delta y, z)] \qquad (8)$$

The reader must carefully check up this formula! The increments
in the square brackets entering into formula (8) are nothing but
partial increments and are therefore connected with the partial
differentials in the way described by formula (IV.23). Hence,

$$f(x + \Delta x, y, z) - f(x, y, z) = f'_x(x, y, z)\,\Delta x + \alpha_1\,\Delta x,$$

$$f(x + \Delta x, y + \Delta y, z) - f(x + \Delta x, y, z) = f'_y(x + \Delta x, y, z)\,\Delta y + \alpha_2\,\Delta y
= [f'_y(x, y, z) + \alpha_3]\,\Delta y + \alpha_2\,\Delta y = f'_y(x, y, z)\,\Delta y + \alpha_4\,\Delta y,$$

$$f(x + \Delta x, y + \Delta y, z + \Delta z) - f(x + \Delta x, y + \Delta y, z) = f'_z(x, y, z)\,\Delta z + \alpha_5\,\Delta z$$

(the last formula is deduced in like manner) where all the quantities
denoted by α tend to zero as Δx → 0, Δy → 0 and Δz → 0.
Substituting these expressions into (8) and writing α, β and γ for
$\alpha_1$, $\alpha_4$ and $\alpha_5$ we obtain

$$\Delta u = u'_x\,\Delta x + u'_y\,\Delta y + u'_z\,\Delta z + \alpha\,\Delta x + \beta\,\Delta y + \gamma\,\Delta z
= du + \alpha\,\Delta x + \beta\,\Delta y + \gamma\,\Delta z \qquad (9)$$

where α, β and γ → 0 as Δx, Δy and Δz → 0. Consequently, in the
general case we can also say that the total differential is the principal
linear part of the increment of a function. We call it linear because
it equals the sum of summands which are directly proportional
to the increments of the arguments and it is regarded as the
principal part of the total increment since it differs from the increment
by an infinitesimal variable of higher order with respect to the
increments of the arguments (compare with Sec. 3). As in the case
of a function of one independent variable, the replacement of an
increment of a function by its differential is equivalent to the
replacement of a non-linear function by a linear one.

Fig. 200
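That the differential is the principal part of the increment can be observed numerically: the difference Δu − du, divided by the size of the increments, must itself tend to zero. The sketch below uses the hypothetical function u = x²y + sin z (chosen only for illustration) with equal increments Δx = Δy = Δz = t.

```python
import math


def f(x, y, z):
    return x * x * y + math.sin(z)


def total_differential(x, y, z, dx, dy, dz):
    """du = u'_x dx + u'_y dy + u'_z dz for u = x^2 y + sin z."""
    return 2 * x * y * dx + x * x * dy + math.cos(z) * dz


def remainder_ratio(t, point=(1.0, 2.0, 0.5)):
    """|increment - differential| / t for equal increments t of all arguments."""
    x, y, z = point
    increment = f(x + t, y + t, z + t) - f(x, y, z)
    differential = total_differential(x, y, z, t, t, t)
    return abs(increment - differential) / t


ratios = [remainder_ratio(t) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The ratios shrink roughly in proportion to t, which is the behaviour expected of an infinitesimal of higher order.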

Formula (9) is illustrated in Fig. 200 for the case of two indepen- 
dent variables. The values of the function (with the infinitesimals 
of higher order neglected) are put down near the corresponding 
points of the x, y-plane.

As in Sec. IV.10, the approximate relations $\Delta_x u \approx \partial_x u$,
$\Delta_y u \approx \partial_y u$, $\Delta_z u \approx \partial_z u$ and, particularly, $\Delta u \approx du$ are the source of
many useful concrete approximate formulas. We note here that
the last formula can be rewritten in full as

$$f(a + h, b + k, c + l) \approx f(a, b, c) + f'_x(a, b, c)\,h + f'_y(a, b, c)\,k + f'_z(a, b, c)\,l$$

It is also easy to deduce the formula

$$\alpha_u = |f'_x(x, y, z)|\,\alpha_x + |f'_y(x, y, z)|\,\alpha_y + |f'_z(x, y, z)|\,\alpha_z$$

which is analogous to formula (IV.29) and is obtained in a similar
way.

For instance, let u = xy. Then

$$\alpha_u = |y|\,\alpha_x + |x|\,\alpha_y, \qquad
\delta_u = \frac{\alpha_u}{|u|} = \frac{|y|\,\alpha_x + |x|\,\alpha_y}{|xy|} = \frac{\alpha_x}{|x|} + \frac{\alpha_y}{|y|} = \delta_x + \delta_y$$

that is the operation of multiplication (and, similarly, the operation
of division) of approximate numbers yields the addition of their
maximum relative errors (let the reader check up this rule for the
case of division!).
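The rule for a product can be tried on concrete numbers. In the sketch below the values x = 2.0 ± 0.01 and y = 5.0 ± 0.02 and all variable names are illustrative assumptions; the linear estimate of the maximum absolute error is compared with the actual worst case.

```python
def max_abs_error_product(x, y, err_x, err_y):
    """Linear estimate |y|*err_x + |x|*err_y of the maximum absolute error of x*y."""
    return abs(y) * err_x + abs(x) * err_y


x, y = 2.0, 5.0
err_x, err_y = 0.01, 0.02

alpha_u = max_abs_error_product(x, y, err_x, err_y)
delta_u = alpha_u / abs(x * y)                    # maximum relative error of u
delta_sum = err_x / abs(x) + err_y / abs(y)       # sum of the relative errors
worst_case = (x + err_x) * (y + err_y) - x * y    # actual largest deviation

print(alpha_u, worst_case)
print(delta_u, delta_sum)
```

The actual worst case exceeds the linear estimate only by the product of the two errors, an infinitesimal of higher order.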

12. Derivative of Composite Function. Let again u = f (x, y, z)
and let the variables no longer be independent but in their turn
depend on some independent variables s and t. Thus we have

$$u = u(x, y, z), \quad x = x(s, t), \quad y = y(s, t), \quad z = z(s, t) \qquad (10)$$

We see that u becomes a composite function of s and t. To calculate
the partial derivative $u'_s$ let us fix t and make s receive an increment
Δs. Then x, y and z will also receive certain partial increments and
therefore u also gains an increment which, according to formula (9),
can be written in the form

$$\Delta_s u = u'_x\,\Delta_s x + u'_y\,\Delta_s y + u'_z\,\Delta_s z + \alpha\,\Delta_s x + \beta\,\Delta_s y + \gamma\,\Delta_s z$$

If now we divide both sides by Δs and pass to the limit as Δs → 0
we shall obtain

$$u'_s = u'_x x'_s + u'_y y'_s + u'_z z'_s
= \frac{\partial u}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial u}{\partial z}\frac{\partial z}{\partial s} \qquad (11)$$

The derivative $u'_t$ is expressed similarly. Thus we see that the rule
we have deduced is analogous to that of differentiating a function
of one independent variable [see formula (IV.9)] but the number of
summands is greater here since the derivatives with respect to all
intermediate variables also enter into the differentiation formula.
Formula (11) implies [this is analogous to the results obtained
in Sec. IV.9] that formula (7) [but not formula (6)!] remains true
even in the case when the former independent variables turn out
to be dependent on some other variables. In fact, in the case of
formula (10) we have

$$du = u'_s\,ds + u'_t\,dt = (u'_x x'_s + u'_y y'_s + u'_z z'_s)\,ds + (u'_x x'_t + u'_y y'_t + u'_z z'_t)\,dt$$
$$= u'_x (x'_s\,ds + x'_t\,dt) + u'_y (y'_s\,ds + y'_t\,dt) + u'_z (z'_s\,ds + z'_t\,dt) = u'_x\,dx + u'_y\,dy + u'_z\,dz$$

which is what we set out to prove. Hence, formula (7) is invariant,
that is holds for all cases [just as it is for formula (IV.22)].
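Formula (11) can be verified on a concrete composite function. The sketch below assumes, purely for illustration, u = xy + z² with x = st, y = s + t, z = s − t and compares the derivative given by formula (11) with a difference quotient of the composite function.

```python
def u(x, y, z):
    return x * y + z * z


def inner(s, t):
    """Intermediate variables x(s, t), y(s, t), z(s, t)."""
    return (s * t, s + t, s - t)


def du_ds_chain(s, t):
    """u'_s = u'_x x'_s + u'_y y'_s + u'_z z'_s, formula (11)."""
    x, y, z = inner(s, t)
    u_x, u_y, u_z = y, x, 2 * z
    x_s, y_s, z_s = t, 1.0, 1.0
    return u_x * x_s + u_y * y_s + u_z * z_s


def du_ds_numeric(s, t, h=1e-6):
    """Central difference quotient of the composite function with t fixed."""
    return (u(*inner(s + h, t)) - u(*inner(s - h, t))) / (2 * h)


s0, t0 = 1.2, 0.7
print(du_ds_chain(s0, t0), du_ds_numeric(s0, t0))
```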

The invariance of the form of the total differential of a function
implies many differentiation formulas. For instance, if w = uv
where u and v may depend on some other variables then formula (7)
yields $dw = w'_u\,du + w'_v\,dv = v\,du + u\,dv$. Consequently, the
formula d (uv) = v du + u dv holds for all cases. In a similar way
we can verify the validity of the following formulas:

$$d(u + v) = du + dv, \qquad d(Cu) = C\,du \quad (C = \text{const}),$$

$$d\left(\frac{u}{v}\right) = \frac{v\,du - u\,dv}{v^2}, \qquad d(u^n) = n u^{n-1}\,du, \qquad d(\sin u) = \cos u\,du \quad\text{etc.}$$

In many cases these formulas enable us to calculate a total
differential directly, without computing the corresponding partial
derivatives.

For example, $d(\sin x^2 y^3) = \cos x^2 y^3\, d(x^2 y^3) = \cos(x^2 y^3)\,[y^3\, d(x^2) + x^2\, d(y^3)]
= \cos x^2 y^3\,(2xy^3\,dx + 3x^2 y^2\,dy)$. Conversely, the total
differential of a function being given, we can restore the partial
derivatives determining them by taking the coefficients in dx and dy.

Here we give several examples on calculating derivatives.

(1) Let $u = f(\sqrt{x^2 + y^2})$. Then

$$u'_x = f'(\sqrt{x^2 + y^2})\,(\sqrt{x^2 + y^2})'_x = \frac{x}{\sqrt{x^2 + y^2}}\, f'(\sqrt{x^2 + y^2})$$

Here the function f itself is a function of one variable. This variable
is replaced by $\sqrt{x^2 + y^2}$; the symbol f' designates the derivative
of f with respect to its single argument.


(2) Let u=f(=.+.9)- Then 
wie fe oa) (toe (oa) Ste GE) 


x 


Here the function f itself is a function of three variables. The expres- 
sions — , 2 and y are substituted, respectively, for these indepen- 


dent variables; the expressions fi, fir and fir designate the deri- 
vatives of f taken with respect to these three variables. 

(3) Let $y = x^{\sin x}$. Then to compute y' we can take logarithms
as it was recommended at the end of Sec. IV.5. But we can
also use another method based on the above results. Let us
denote $y = u^{\sin v}$ where u = x and v = x (such intermediate
variables are usually introduced mentally without putting them
down). Now we can differentiate y as a composite function (u and v
are regarded as intermediate variables):

$$y'_x = y'_u u'_x + y'_v v'_x = \sin v \cdot u^{\sin v - 1} \cdot 1 + u^{\sin v} \ln u \cos v \cdot 1
= \sin x \cdot x^{\sin x - 1} + x^{\sin x} \ln x \cos x$$

It is obvious that the last method is more general. If we have
to compute the derivative with respect to x of an expression into
which x enters several times we should differentiate with respect
to each ("imagined") argument that involves x, multiply by the
derivative with respect to x and add together the results obtained.
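The result of example (3) can be compared with a difference quotient. The sketch below is a plain numerical check at the arbitrarily chosen point x = 2; the helper names are illustrative.

```python
import math


def y(x):
    return x ** math.sin(x)


def y_prime(x):
    """sin x * x^(sin x - 1) + x^(sin x) * ln x * cos x."""
    s = math.sin(x)
    return s * x ** (s - 1) + x ** s * math.log(x) * math.cos(x)


def central_difference(g, x, h=1e-6):
    return (g(x + h) - g(x - h)) / (2 * h)


x0 = 2.0
print(y_prime(x0), central_difference(y, x0))
```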

Here we are only going to consider functions of three variables
but the following definition will also be useful for our further aims
since it can be easily transferred to the case of functions of any
number of independent variables. A function F (x, y, z) is called
a homogeneous function of degree k if for any t > 0 we have

$$F(tx, ty, tz) = t^k F(x, y, z) \qquad (12)$$

For instance, the function F (x, y, z) = x² − 3yz is a homogeneous
function of degree 2 because

$$F(tx, ty, tz) = (tx)^2 - 3(ty)(tz) = t^2 (x^2 - 3yz) = t^2 F(x, y, z)$$

We can similarly verify that the function [...] is a homogeneous
function of degree zero, the function [...] is a
homogeneous function of degree −4 whereas, for example, the
function x + 2y − z + 1 is not a homogeneous one at all.
In the general case formula (12) can be written as
$F(ta, tb, tc) = t^k F(a, b, c)$ for any a, b and c. If we differentiate with respect
to t (and regard a, b and c as constants) we obtain

$$F'_x(ta, tb, tc)\,a + F'_y(ta, tb, tc)\,b + F'_z(ta, tb, tc)\,c = k t^{k-1} F(a, b, c)$$

Now putting t = 1, a = x, b = y and c = z in the last formula we
deduce the formula

$$x F'_x(x, y, z) + y F'_y(x, y, z) + z F'_z(x, y, z) = k F(x, y, z)$$

which expresses so-called Euler's theorem on homogeneous functions.
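Euler's theorem can be checked numerically for the function F = x² − 3yz considered above, which is homogeneous of degree 2. In the sketch below the helper names and the test point are illustrative, and the partial derivatives are approximated by central difference quotients.

```python
def F(x, y, z):
    return x * x - 3 * y * z


def partial(f, point, i, h=1e-6):
    """Central difference quotient with respect to the i-th argument."""
    plus, minus = list(point), list(point)
    plus[i] += h
    minus[i] -= h
    return (f(*plus) - f(*minus)) / (2 * h)


def euler_lhs(f, point):
    """x F'_x + y F'_y + z F'_z evaluated at the given point."""
    return sum(point[i] * partial(f, point, i) for i in range(3))


point = (1.5, -2.0, 0.7)
k = 2
print(euler_lhs(F, point), k * F(*point))
```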
13. Derivative of Implicit Function. Let an implicit function
z = z (x, y) be defined by an equation

$$F(x, y, z) = 0 \qquad (13)$$

To calculate the partial derivative $z'_x$ we must fix y and differentiate
formula (13) with respect to x taking into account that z also depends
on x. Thus, applying the rule of differentiating a composite function
we obtain

$$F'_x x'_x + F'_z z'_x = 0, \quad\text{i.e.}\quad F'_x + F'_z z'_x = 0 \qquad (14)$$

which implies

$$z'_x = -\frac{F'_x(x, y, z)}{F'_z(x, y, z)} \qquad (15)$$

Similarly, $z'_y = -\frac{F'_y}{F'_z}$. To guarantee that this derivative assumes a
finite value we should additionally introduce the requirement

$$F'_z(x, y, z) \ne 0 \qquad (16)$$

Formula (16) expresses a sufficient condition for the existence
of an implicit function z = z (x, y) defined by equation (13); the
geometrical meaning of the condition will be illustrated in Sec. XII.3.
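Formula (15) can be tested on an equation that happens to be solvable explicitly. The sketch below assumes, for illustration only, F = x² + y² + z² − 1 and its upper branch z = √(1 − x² − y²), and compares −F'_x/F'_z with a difference quotient of the explicit branch.

```python
import math


def z_explicit(x, y):
    """Upper single-valued branch of x^2 + y^2 + z^2 - 1 = 0."""
    return math.sqrt(1 - x * x - y * y)


def z_x_implicit(x, y, z):
    """z'_x = -F'_x / F'_z for F = x^2 + y^2 + z^2 - 1; requires F'_z != 0."""
    F_x, F_z = 2 * x, 2 * z
    if F_z == 0:
        raise ValueError("condition (16) is violated: F'_z = 0")
    return -F_x / F_z


def z_x_numeric(x, y, h=1e-6):
    """Central difference quotient of the explicit branch with y fixed."""
    return (z_explicit(x + h, y) - z_explicit(x - h, y)) / (2 * h)


x0, y0 = 0.3, 0.4
z0 = z_explicit(x0, y0)
print(z_x_implicit(x0, y0, z0), z_x_numeric(x0, y0))
```

On the equator z = 0 the derivative F'_z vanishes, condition (16) fails, and the sphere cannot be written there as a single-valued function z (x, y).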

Implicit functions may be defined by a system of equations.
Suppose we have m equations which are compatible (that is they do not
contradict each other and can be solved, at least theoretically),
independent (this means that none of the equations is the consequence
of the others) and connect n variables. Then if m < n (i.e. the number
of equations is less than the number of variables) we
can regard certain n − m variables as independent. We can make
them take on arbitrary values and express the remaining m variables
as functions of these independent variables by solving the equations.
(If the number of equations equals or exceeds the number of
variables then, generally, we get some discrete values and it is therefore
impossible to construct functions.) As an example let us take the
case of two equations containing five variables, namely,

$$f(x, y, z, u, v) = 0, \qquad \varphi(x, y, z, u, v) = 0 \qquad (17)$$

Here we can regard three variables as independent and the other
two variables as functions of these independent variables. For
definiteness, let us assume that u = u (x, y, z) and v = v (x, y, z)
and try to compute the derivatives $u'_x$ and $v'_x$. For this purpose we
differentiate both equations (17) with respect to x (y and z are regarded as fixed).
This yields

$$f'_x \cdot 1 + f'_u u'_x + f'_v v'_x = 0, \quad \varphi'_x \cdot 1 + \varphi'_u u'_x + \varphi'_v v'_x = 0, \quad\text{i.e.}\quad
f'_u u'_x + f'_v v'_x = -f'_x, \quad \varphi'_u u'_x + \varphi'_v v'_x = -\varphi'_x \qquad (18)$$

Hence, we have a system of two equations of the first degree
[i.e. system (18)] containing the two unknown quantities $u'_x$ and $v'_x$.
If the determinant of the system is not equal to zero the system can
be solved (see Sec. VI.4). Let us therefore suppose that

$$\begin{vmatrix} f'_u & f'_v \\ \varphi'_u & \varphi'_v \end{vmatrix} \ne 0 \qquad (19)$$

Solving the equations we can find the sought-for derivatives.

The derivatives with respect to y and z are found similarly. It
is important that the determinant of the system for the derivatives
with respect to y and z is equal to (19) again (only the right-hand
sides of the system differ from the former ones). Consequently,
(19) is a sufficient condition for existence of the implicit functions
u = u (x, y, z) and v = v (x, y, z) which are defined by system
of equations (17). In the general case such a condition is derived in
a similar way and is analogous to (19).
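The procedure can be followed on a small system that also admits an explicit solution. The sketch below assumes the hypothetical pair f = u + v − x, φ = uv − y (so that u and v are the roots of t² − xt + y = 0), solves the linear system of type (18) by determinants, and compares the result with difference quotients of the explicit roots. All names are illustrative.

```python
import math


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*p + a12*q = b1, a21*p + a22*q = b2 by determinants."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the determinant of the system vanishes")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det)


def derivatives_wrt_x(x, y, u, v):
    """u'_x and v'_x for f = u + v - x = 0, phi = u*v - y = 0."""
    f_x, f_u, f_v = -1.0, 1.0, 1.0
    p_x, p_u, p_v = 0.0, v, u
    return solve_2x2(f_u, f_v, p_u, p_v, -f_x, -p_x)


def roots(x, y):
    """Explicit solution: u is the larger and v the smaller root of t^2 - x t + y = 0."""
    d = math.sqrt(x * x - 4 * y)
    return ((x + d) / 2, (x - d) / 2)


x0, y0, h = 5.0, 4.0, 1e-6
u0, v0 = roots(x0, y0)
u_x, v_x = derivatives_wrt_x(x0, y0, u0, v0)
u_x_num = (roots(x0 + h, y0)[0] - roots(x0 - h, y0)[0]) / (2 * h)
v_x_num = (roots(x0 + h, y0)[1] - roots(x0 - h, y0)[1]) / (2 * h)
print(u_x, u_x_num)
print(v_x, v_x_num)
```

Here the determinant equals u − v, so the condition fails exactly where the two roots coincide and the branches u and v can no longer be separated.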

A functional determinant (i.e. a determinant whose elements
are the derivatives of some functions) of form (19) is widely
encountered in mathematics and is called a Jacobian after K. Jacobi
(1804-1851), a German mathematician. There is a special symbol for
designating such a determinant: $\frac{D(f, \varphi)}{D(u, v)}$. The expression
$\frac{D(f, \varphi)}{D(u, v)}$ should be regarded as an indivisible symbol because at the
present moment the denominator and the numerator taken
separately make no sense to us yet.

Analogous questions arise when we have to solve a system of
equations containing parameters. For example, let the following
system of two equations with the two unknown quantities x and
y be considered:

$$f(x, y; \alpha, \beta, \gamma, \ldots) = 0, \qquad \varphi(x, y; \alpha, \beta, \gamma, \ldots) = 0$$

where α, β, γ, ... are parameters. Suppose that the system has the
solution $x_0$, $y_0$ for certain values $\alpha_0$, $\beta_0$, $\gamma_0$, ... of the parameters
and $\frac{D(f, \varphi)}{D(x, y)} \ne 0$ for these values. Then, by the above reasoning,
the system defines x and y as functions of α, β, γ, ..., that is the
system has a uniquely defined solution as the parameters vary and
take on certain values lying in the vicinity of the values $\alpha_0$, $\beta_0$, $\gamma_0$, ....
By the way, as it will be shown in Sec. XII.3, the condition
$\frac{D(f, \varphi)}{D(x, y)} \ne 0$ guarantees only the local solvability of the system;
the system may not be solvable if the increments of the parameters
become too large. It should be underlined that such a "stability"
of a solution with respect to variations of the parameters can be
guaranteed only if the number m of the equations is equal to the
number n of the unknown quantities. If m < n then n − m
unknown quantities remain arbitrary and if m > n then the
solvability of the system implies that there are m − n additional
relationships between the parameters.


§ 4. Partial Derivatives and Differentials of 
Higher Orders 


14. Definitions. For definiteness, let u = f (x, y, z) (functions
of any number of arguments can be investigated similarly). Then,
as has been shown, we have the three partial derivatives of the first
order $u'_x = f'_x(x, y, z)$, $u'_y = f'_y(x, y, z)$ and $u'_z = f'_z(x, y, z)$.
Each of them can be differentiated again with respect to x, y
and z. Hence, we obtain the following nine partial derivatives of
the second order:

$$u''_{xx} = f''_{xx}(x, y, z), \quad u''_{xy} = f''_{xy}(x, y, z), \quad u''_{xz} = f''_{xz}(x, y, z),$$
$$u''_{yx} = f''_{yx}(x, y, z), \quad u''_{yy} = f''_{yy}(x, y, z), \quad u''_{yz} = f''_{yz}(x, y, z),$$
$$u''_{zx} = f''_{zx}(x, y, z), \quad u''_{zy} = f''_{zy}(x, y, z) \quad\text{and}\quad u''_{zz} = f''_{zz}(x, y, z)$$

The differentiation of elementary functions represented explicitly
is performed in accordance with the rules given in Sec. IV.5. The
differentiation of implicit functions is achieved by the repeated
differentiation of equalities of type (14), (15) or (18) and the like.
Derivatives of orders higher than the second are defined analogously.

Partial differentials of higher orders are defined in a similar manner
and, just as it was done in Sec. IV.12 and Sec. 10, we arrive at
the equalities

$$\partial^2_{xx} u = u''_{xx}\,dx^2, \qquad \partial^2_{xy} u = u''_{xy}\,dx\,dy \quad\text{etc.} \qquad (20)$$

where the differential of the independent variable x is understood
as $dx = \Delta x = \partial_x x$ and so on. From this we deduce

$$u''_{xx} = \frac{\partial^2_{xx} u}{(\partial_x x)^2} = \frac{\partial^2 u}{\partial x^2}, \qquad
u''_{xy} = \frac{\partial^2_{xy} u}{(\partial_x x)(\partial_y y)} = \frac{\partial^2 u}{\partial x\,\partial y}, \qquad
u''_{xz} = \frac{\partial^2 u}{\partial x\,\partial z} \quad\text{etc.}$$

The notion of a partial difference of a function of several variables
can be defined in a way similar to the one used in Sec. V.7. But in
this case we must indicate the variable with respect to which the
difference is taken. The differences can be taken with different steps
for different variables. For example, let z = f (x, y); then we can
designate the step along the x-axis by h and use the symbol $\Delta_h$ for
denoting the partial difference with respect to x:
$\Delta_h z = f(x + h, y) - f(x, y)$. Similarly, let k designate the step
along the y-axis and let $\Delta_k$ be the symbol for the partial difference
corresponding to y: $\Delta_k z = f(x, y + k) - f(x, y)$. Then it is natural
to introduce the notation

$$\Delta^2_{hh} z = \Delta_h(\Delta_h z), \quad \Delta^2_{hk} z = \Delta_k(\Delta_h z) \quad \text{etc.}$$


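The connection between partial difference quotients and the
derivatives is expressed by the formulas

$$z'_x = \lim_{h \to 0} \frac{\Delta_h z}{h}, \quad z'_y = \lim_{k \to 0} \frac{\Delta_k z}{k}, \quad z''_{xx} = \lim_{h \to 0} \frac{\Delta^2_{hh} z}{h^2}, \quad z''_{xy} = \lim_{h \to 0,\ k \to 0} \frac{\Delta^2_{hk} z}{hk}$$

and the like.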
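The limit formulas of Sec. 14 can be tried out numerically. The following is only a minimal sketch: the sample function f, the point (x0, y0) and the steps h and k are assumptions chosen for illustration and are not taken from the text. It forms the partial differences, compares the difference quotients with the exact derivatives, and computes the mixed second difference in both orders.

```python
import math

def f(x, y):
    # Hypothetical sample function, chosen only for illustration.
    return math.sin(x) * math.exp(2.0 * y)

def diff_h(g, h):
    """Partial difference of g with step h with respect to x."""
    return lambda x, y: g(x + h, y) - g(x, y)

def diff_k(g, k):
    """Partial difference of g with step k with respect to y."""
    return lambda x, y: g(x, y + k) - g(x, y)

x0, y0, h, k = 0.3, 0.2, 1e-3, 1e-3

# First-order difference quotients approximate z'_x and z'_y.
zx_approx = diff_h(f, h)(x0, y0) / h
zy_approx = diff_k(f, k)(x0, y0) / k

# Mixed second differences taken in the two possible orders.
d_hk = diff_k(diff_h(f, h), k)(x0, y0)  # difference in x first, then in y
d_kh = diff_h(diff_k(f, k), h)(x0, y0)  # difference in y first, then in x
zxy_approx = d_hk / (h * k)

# Exact derivatives of the sample function, for comparison.
zx_exact = math.cos(x0) * math.exp(2.0 * y0)
zy_exact = 2.0 * math.sin(x0) * math.exp(2.0 * y0)
zxy_exact = 2.0 * math.cos(x0) * math.exp(2.0 * y0)
```

The quotients agree with the exact values only up to an error of the order of the steps, and the two mixed differences coincide up to rounding.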

15. Equality of Mixed Derivatives. Let z = f (x, y). Then the
function has four partial derivatives of the second order, namely
$z''_{xx}$, $z''_{xy}$, $z''_{yx}$ and $z''_{yy}$. It turns out that the derivatives
$z''_{xy}$ and $z''_{yx}$, which are called mixed partial derivatives, are equal:

$$z''_{xy} = z''_{yx} \qquad (21)$$

that is the mixed derivatives are independent of the order in which the
differentiation is performed. (This is true in case $z''_{xy}$ and $z''_{yx}$ are
continuous.)


To prove formula (21) it is sufficient to observe that, according
to the end of Sec. 14,

$$z''_{xy} = \lim_{h,\,k \to 0} \frac{1}{hk}\,\Delta^2_{hk} z, \quad z''_{yx} = \lim_{h,\,k \to 0} \frac{1}{hk}\,\Delta^2_{kh} z \qquad (22)$$

At the same time

$$\Delta^2_{hk} z = \Delta_k(\Delta_h z) = \Delta_k [f(x + h, y) - f(x, y)] = [f(x + h, y + k) - f(x, y + k)] - [f(x + h, y) - f(x, y)] = f(x + h, y + k) - f(x, y + k) - f(x + h, y) + f(x, y),$$

$$\Delta^2_{kh} z = \Delta_h(\Delta_k z) = \Delta_h [f(x, y + k) - f(x, y)] = [f(x + h, y + k) - f(x + h, y)] - [f(x, y + k) - f(x, y)] = f(x + h, y + k) - f(x + h, y) - f(x, y + k) + f(x, y),$$

i.e.

$$\Delta^2_{hk} z = \Delta^2_{kh} z$$

and hence the mixed differences are independent of the order in
which they are calculated. From this and from (22) it follows now
that formula (21) is true.

If now we consider the derivatives of orders higher than the second
of a function it is permissible, in accord with formula (21), to
interchange any two successive operations of differentiation carried out
with respect to any variables. Thus, we can pass from any order
of performing the differentiation to any other order. Hence, only
the number of differentiations with respect to the corresponding
variables is essential here but not the order in which the operations
are performed. For instance,

$$u'''_{xxy} = u'''_{xyx} = u'''_{yxx} \quad \text{etc.},$$

$$u^{IV}_{xxyz} = u^{IV}_{xyxz} = u^{IV}_{xyzx} = u^{IV}_{xzyx} = u^{IV}_{zxyx}$$

but at the same time these derivatives are, generally speaking, not
equal to a derivative in which the number of differentiations with
respect to some variable is different.

Partial differentials [see formula (20)] are also independent of the
order of differentiating a function.

16. Total Differentials of Higher Order. The total differential of
any given order is defined as the total differential (see Sec. 14) of
the total differential of the preceding order. As before (see Sec. IV.12),
the differentials of independent variables should be regarded as
constant quantities in subsequent differentiations. For example,
let z = f (x, y). Then,

$$dz = z'_x\,dx + z'_y\,dy,$$

$$d^2 z = d(dz) = (z'_x\,dx + z'_y\,dy)'_x\,dx + (z'_x\,dx + z'_y\,dy)'_y\,dy = z''_{xx}\,dx^2 + z''_{yx}\,dy\,dx + z''_{xy}\,dx\,dy + z''_{yy}\,dy^2 = z''_{xx}\,dx^2 + 2z''_{xy}\,dx\,dy + z''_{yy}\,dy^2 \qquad (23)$$

Here we have used formula (21). Further, we have

$$d^3 z = (z''_{xx}\,dx^2 + 2z''_{xy}\,dx\,dy + z''_{yy}\,dy^2)'_x\,dx + (z''_{xx}\,dx^2 + 2z''_{xy}\,dx\,dy + z''_{yy}\,dy^2)'_y\,dy =$$
$$= (z'''_{xxx}\,dx^3 + 2z'''_{xxy}\,dx^2\,dy + z'''_{xyy}\,dx\,dy^2) + (z'''_{xxy}\,dx^2\,dy + 2z'''_{xyy}\,dx\,dy^2 + z'''_{yyy}\,dy^3) =$$
$$= z'''_{xxx}\,dx^3 + 3z'''_{xxy}\,dx^2\,dy + 3z'''_{xyy}\,dx\,dy^2 + z'''_{yyy}\,dy^3$$

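We see that here, as well as in deducing formula (IV.32), the
calculations are performed according to the scheme in which we
successively remove the brackets in the expressions (a + b)², (a + b)³ etc.
In the general case we have

$$d^n z = \frac{\partial^n z}{\partial x^n}\,dx^n + \binom{n}{1} \frac{\partial^n z}{\partial x^{n-1}\,\partial y}\,dx^{n-1}\,dy + \binom{n}{2} \frac{\partial^n z}{\partial x^{n-2}\,\partial y^2}\,dx^{n-2}\,dy^2 + \dots + \frac{\partial^n z}{\partial y^n}\,dy^n$$

This result can be written in the following symbolical form:

$$d^n z = \left( dx\,\frac{\partial}{\partial x} + dy\,\frac{\partial}{\partial y} \right)^n z$$

In the right-hand side of the last formula the brackets should be
removed as if ∂, ∂x, ∂y, dx and dy were ordinary algebraic factors.
In a similar way, if u = f (x, y, z) then

$$d^n u = \left( dx\,\frac{\partial}{\partial x} + dy\,\frac{\partial}{\partial y} + dz\,\frac{\partial}{\partial z} \right)^n u$$

and so on.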
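The binomial-type formula for the differential of order n can be compared with the recursive definition of Sec. 16 on a polynomial. The sketch below is an illustration under stated assumptions: the polynomial z = x³y² + 2xy, its dictionary representation and all function names are hypothetical. It evaluates $d^n z$ once by the binomial formula and once by repeatedly applying $d(\dots) = (\dots)'_x\,dx + (\dots)'_y\,dy$ with dx and dy held constant.

```python
from math import comb

def partial(poly, var):
    """Differentiate {(i, j): c}, meaning sum of c*x^i*y^j, in x (var=0) or y (var=1)."""
    out = {}
    for exps, c in poly.items():
        e = exps[var]
        if e > 0:
            new = list(exps)
            new[var] -= 1
            out[tuple(new)] = out.get(tuple(new), 0) + c * e
    return out

def evaluate(poly, x, y):
    return sum(c * x**i * y**j for (i, j), c in poly.items())

def mixed(poly, nx, ny):
    """Derivative taken nx times in x and ny times in y."""
    for _ in range(nx):
        poly = partial(poly, 0)
    for _ in range(ny):
        poly = partial(poly, 1)
    return poly

def dn_binomial(poly, n, x, y, dx, dy):
    """d^n z computed from the binomial-type formula."""
    return sum(comb(n, j) * evaluate(mixed(poly, n - j, j), x, y)
               * dx**(n - j) * dy**j for j in range(n + 1))

def dn_recursive(poly, n, x, y, dx, dy):
    """d^n z computed as d(d^(n-1) z), with dx and dy treated as constants."""
    if n == 0:
        return evaluate(poly, x, y)
    return (dn_recursive(partial(poly, 0), n - 1, x, y, dx, dy) * dx
            + dn_recursive(partial(poly, 1), n - 1, x, y, dx, dy) * dy)

# Hypothetical example: z = x^3 * y^2 + 2*x*y.
z = {(3, 2): 1, (1, 1): 2}
```

With integer arguments both functions work in exact arithmetic, so for this example the two results can be compared for equality.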


If z = f (x, y) but x and y are no longer independent variables
we must change formula (23) in a manner similar to the one
used at the end of Sec. IV.12:

$$d^2 z = d(z'_x\,dx + z'_y\,dy) = d(z'_x\,dx) + d(z'_y\,dy) = d(z'_x)\,dx + z'_x\,d(dx) + d(z'_y)\,dy + z'_y\,d(dy) =$$
$$= (z''_{xx}\,dx + z''_{xy}\,dy)\,dx + z'_x\,d^2 x + (z''_{yx}\,dx + z''_{yy}\,dy)\,dy + z'_y\,d^2 y =$$
$$= z''_{xx}\,dx^2 + 2z''_{xy}\,dx\,dy + z''_{yy}\,dy^2 + z'_x\,d^2 x + z'_y\,d^2 y \qquad (24)$$

The expressions for subsequent differentials are changed in a similar
way.


CHAPTER X 


Solid Analytic Geometry 


§ 1. Space Coordinates 


1. Coordinate Systems in Space. Besides Cartesian coordinates
described in Sec. VII.9, the following coordinate systems are widely
used.

1. Cylindrical coordinates ρ, φ and z are shown in Fig. 201. To
construct a cylindrical coordinate system we choose polar coordinates
(in the x, y-plane) and add the third coordinate z to them.


Fig. 201    Fig. 202


Obviously, for all the points in space to be described, it is sufficient
that we take the following ranges for ρ, φ and z: 0 ≤ ρ < ∞,
−π < φ ≤ π and −∞ < z < ∞.

The coordinate surfaces, that is surfaces on which one of the
coordinates is constant whereas the other two vary, form three families
of surfaces, namely ρ = const, φ = const and z = const. These
surfaces are depicted in Fig. 202. All these surfaces are of course
regarded as being extended to infinity. The coordinate curves, that
is curves on which two coordinates are constant whereas one of the
coordinates varies, constitute three families of curves which are
formed by the intersection of the coordinate surfaces. The coordinate
curves are shown in heavy lines in Fig. 202. (By the way, the
coordinate surfaces of a Cartesian coordinate system are the planes
parallel to the planes xOy, yOz or zOx, and the coordinate curves
are the straight lines parallel to the coordinate axes Ox, Oy or Oz.)

Let us take the Cartesian coordinates (x, y, z) and the cylindrical
coordinates (ρ, φ, z) which are placed as it is shown in Fig. 203.

Fig. 203    Fig. 204

Then the relationship between the coordinates is expressed by the
formulas x = ρ cos φ, y = ρ sin φ and z = z.

Cylindrical coordinates are often applied to investigating solids 
and surfaces of revolution (such as a circular cylinder or a circular 
cone etc.), and the z-axis is placed along the axis of revolution in 
such investigations. 

2. Spherical coordinates (which are sometimes called spatial polar
coordinates) are shown in Fig. 204. These coordinates are analogous
to the geographical coordinates. The distinction between them
is that the “latitude” θ is reckoned here from the “North Pole” whereas
in geography it is reckoned from the equator. To describe all the
points in space it is sufficient to take r, θ and φ within the limits
0 ≤ r < ∞, 0 ≤ θ ≤ π and −π < φ ≤ π.

The coordinate surfaces and curves are shown in Fig. 205. If
Cartesian coordinates (x, y, z) and spherical coordinates (r, θ, φ)
are located as it is shown in Fig. 206 then the relationship between
them is expressed by the formulas

$$x = \rho \cos\varphi = r \sin\theta \cos\varphi,$$
$$y = \rho \sin\varphi = r \sin\theta \sin\varphi,$$
$$z = r \cos\theta$$

Spherical coordinates are especially convenient for investigating
solids bounded by surfaces of the type shown in Fig. 205 but they
are also applied in many other cases.

Cartesian, cylindrical and spherical coordinates are particular
cases of the so-called orthogonal coordinates in which any two
intersecting coordinate curves form a right angle (check this property
for the above coordinate systems!). Some non-orthogonal coordinate
systems are also applied to certain problems (for instance, general
affine coordinates mentioned in Sec. VII.9).
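The transformation formulas for cylindrical and spherical coordinates, and the orthogonality property just mentioned, can be checked numerically. What follows is a sketch only: the function names and the finite-difference step are assumptions made for illustration. Tangent vectors to the three coordinate curves through a point are approximated by central differences, and their pairwise scalar products are expected to be close to zero.

```python
import math

def cylindrical_to_cartesian(rho, phi, z):
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def spherical_to_cartesian(r, theta, phi):
    rho = r * math.sin(theta)  # distance from the z-axis
    return (rho * math.cos(phi), rho * math.sin(phi), r * math.cos(theta))

def tangent(transform, point, index, step=1e-6):
    """Approximate tangent to the coordinate curve on which only coordinate `index` varies."""
    lo, hi = list(point), list(point)
    lo[index] -= step
    hi[index] += step
    a, b = transform(*lo), transform(*hi)
    return tuple((bi - ai) / (2 * step) for ai, bi in zip(a, b))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def max_abs_dot(transform, point):
    """Largest |scalar product| among the three pairs of coordinate-curve tangents."""
    t = [tangent(transform, point, i) for i in range(3)]
    return max(abs(dot(t[i], t[j])) for i in range(3) for j in range(i + 1, 3))
```

For example, `max_abs_dot(spherical_to_cartesian, (2.0, 0.7, 1.1))` should be zero up to the accuracy of the numerical differentiation.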

Fig. 205    Fig. 206

2. Degrees of Freedom. We have seen that different coordinate
systems can be introduced in space. A general property common to
all the systems is that the position of a point in space is specified
by three coordinates whereas the position of a point in a plane is
specified by two coordinates and a point on a curve by one coordinate.
This property can be expressed in the following terms: there are
three degrees of freedom when we choose a point in space
or when a point moves in space whereas there are two degrees of
freedom when we choose a point belonging to a plane (and also to
an arbitrary surface) and one degree of freedom when we consider
a point on a curve. Or, in other words, the geometric space is
three-dimensional whereas surfaces are two-dimensional and
curves are one-dimensional.

In the general case the notion of a degree of freedom is introduced
in the following way. Let there be a certain set of objects (in the
above example such a set was the totality of all the points in space).
Suppose that each of the objects can be specified by indicating
numerical values of some parameters (in the above example such
parameters were the coordinates of a point). Let these parameters satisfy
the following requirements.

(1) The parameters are independent, that is they can assume
arbitrary values. For instance, if we fix all the parameters but one
then this single parameter can be varied arbitrarily (sometimes
within certain limits).

(2) The parameters are essential, that is any variation of the 
parameters leads in fact to certain changes of the object in question. 

If these conditions are satisfied and if there are k such parameters 
then we say that we have k degrees of freedom in choosing an object 
from this set. The set of objects in question is then called a k-di- 
mensional space (generalized space) or a k-dimensional manifold. The 
parameters are called coordinates (generalized coordinates) in the 
space. As in the case of ordinary coordinates, generalized coordinates
can be chosen in different ways, and a specific choice of coordinates 
is usually made so that it should be convenient for the investigation. 
The objects which constitute a space are called its elements or points. 
Hence, a many-dimensional space acquires a concrete interpretation. 

The above definition of a dimension is in agreement with the
definition of the dimension of a linear space given in Sec. VII.19
because in such a space the coefficients of the resolution of a vector 
with respect to a fixed basis can be taken as parameters. But now 
we consider spaces which belong to a more general class, and the only 
connection between the elements of such a generalized space is 
that these objects are taken from a certain set. We can introduce 
the notion of closeness of the elements if we consider any elements 
whose parameters are close to each other as being close in the space 
in question. If such a notion is introduced in a space we can easily 
define the notion of a limit in this space. Such a space in which the 
notion of closeness of its elements is introduced (or, which is the 
same, the notion of passage to a limit is defined) is called a topological 
space.

Let us consider some examples. Take the set of all the circles lying
in a plane. Each of the circles is completely determined
by the numerical values of three parameters, namely by the coor- 
dinates (x, y) of its centre and by its radius r. These parameters are 
independent (each of them can be varied arbitrarily) and essential 
(every variation of one of the parameters leads to a certain variation 
of the circle in question). Therefore, when we choose a circle in 
a plane we have three degrees of freedom, and hence such a set of 
circles is a three-dimensional generalized space with the coordinates 
x, y and r. Similarly, the set of all spheres in space is a four-dimen- 
sional space. 

In physics we usually consider the set of events. An event is
completely characterized if we can answer the questions “where does
the event take place?” and “when does the event occur?”. We can answer
the first question by indicating the corresponding coordinates, for
instance, the Cartesian coordinates x, y and z. The second question
is answered if we indicate the corresponding moment of time t.
The space of events is therefore four-dimensional, and we can choose
the quantities x, y, z and t as generalized coordinates in the
space.

One more example: what is the number of degrees of freedom when
a line segment of given length l moves in space? Each segment of
this kind is completely specified by the coordinates $(x_1, y_1, z_1)$ and
$(x_2, y_2, z_2)$ of its end-points. These coordinates can be taken as
parameters which specify the position of the segment. These parameters
are obviously essential but not independent since they are connected
by the equation

$$(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 = l^2$$

implied by formula (VII.14). Hence, only five parameters can be
regarded as being independent. If we arbitrarily choose five of the
parameters then the sixth parameter is expressed in terms of the
five parameters with the help of the above relation. Thus, a line
segment of given length has five degrees of freedom when it moves
in space.

Generally, if we have n parameters which are essential but are 
connected by m independent equations (that is by equations such 
that none of them is implied by the others) then we can choose 
n — m parameters as independent parameters, and the remaining 
m parameters will be expressed in terms of the former. Hence, there 
will be n — m degrees of freedom. For instance, when a triangle 
moves in space we have 9 − 3 = 6 degrees of freedom (check the
result!). This example is important in connection with the fact
that the position of a rigid body (“perfectly rigid body”) is completely 
defined if we indicate the positions of its three points which do not 
lie on the same straight line (why?). Consequently, we have six 
degrees of freedom when we investigate the motion of a rigid body 
in space. 
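The count n − m can be illustrated numerically if the independence of the m equations is tested by the rank of the matrix of their partial derivatives at a sample position. This rank test is an assumption introduced here for illustration, not the argument used in the text; the helper names, sample coordinates and tolerances are likewise hypothetical. The sketch treats the segment of given length and the triangle with given sides.

```python
def jacobian(constraints, point, step=1e-5):
    """Central-difference matrix of partial derivatives of the constraints."""
    rows = []
    for g in constraints:
        row = []
        for i in range(len(point)):
            lo, hi = list(point), list(point)
            lo[i] -= step
            hi[i] += step
            row.append((g(hi) - g(lo)) / (2 * step))
        rows.append(row)
    return rows

def rank(matrix, tol=1e-8):
    """Rank found by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        if r == len(m):
            break
        pivot = max(range(r, len(m)), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            for j in range(col, len(m[0])):
                m[i][j] -= factor * m[r][j]
        r += 1
    return r

def sq_dist(p, i, j):
    """Squared distance between the i-th and j-th points stored in the flat list p."""
    return sum((p[3 * i + c] - p[3 * j + c]) ** 2 for c in range(3))

def degrees_of_freedom(constraints, point):
    """n - m, where m is the number of independent equations at the given position."""
    return len(point) - rank(jacobian(constraints, point))

# Segment of length 3: six coordinates, one equation.
segment = [0.0, 0.0, 0.0, 1.0, 2.0, 2.0]
segment_constraints = [lambda p: sq_dist(p, 0, 1) - 9.0]

# Triangle with fixed sides: nine coordinates, three equations.
triangle = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
triangle_constraints = [
    lambda p: sq_dist(p, 0, 1) - 1.0,
    lambda p: sq_dist(p, 1, 2) - 2.0,
    lambda p: sq_dist(p, 2, 0) - 1.0,
]
```

Calling `degrees_of_freedom(segment_constraints, segment)` and `degrees_of_freedom(triangle_constraints, triangle)` reproduces the counts 6 − 1 and 9 − 3 for these sample positions.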

In addition, let us find the number of degrees of freedom when an
infinite straight line moves in space. We can reason in the following 
way: if we choose two arbitrary points A and B in space (each of 
the points is defined by its three coordinates) and draw a straight 
line (P) passing through the points we shall have six parameters 
specifying the position of the line. Since these parameters are obvio- 
usly independent one may think that there are six degrees of free- 
dom here. But such a conclusion is wrong because there are cases 
here when a certain variation of the parameters does not change the 
position of the straight line although it makes the points A and 
B move (along the line (P) when it occupies a fixed position). Hence, 
the condition that the parameters should be essential does not hold 
here. When the point A slides along the straight line (P) it has one
degree of freedom and when the second point B slides along (P)
it also has one degree of freedom. Such motions of the points do not 
affect the position of the line and therefore in the above calculation 
there are two unnecessary degrees of freedom which must not be 
taken into account. Thus, in fact the number of degrees of freedom 
is equal to 6 — 2 = 4. For example, we can choose the coordinates 
of points of intersection of the straight line (P) with the planes 
xOy and yOz as independent and essential parameters. By the way, 
not all straight lines intersect these planes but this fact does not 
affect the validity of our general consideration concerning the cal- 
culation of the number of degrees of freedom. 

In a k-dimensional space there may exist some subsets of points
which form subspaces (submanifolds) of the same dimension or of
a lower dimension. If a point happens to belong to a subspace (S)
of dimension k − 1 this can be regarded as an “extraordinary event”
because in such a case the generalized coordinates $\alpha_1, \alpha_2, \dots, \alpha_k$
of the point must satisfy a relation of the form
$f_S(\alpha_1, \alpha_2, \dots, \alpha_k) = 0$. (Of course, we sometimes intentionally
consider the motion of a point along a certain submanifold and then
there is no reason to regard this as an extraordinary fact.) The typical
case (basic case, general case) is when a point (taken at random) does
not belong to (S), that is the case when the inequality $f_S \neq 0$ is
fulfilled. If the inequality is fulfilled for certain values of the
coordinates then it remains true when the coordinates vary but their
variations are sufficiently small. On the other hand, if the equality
$f_S = 0$ is fulfilled for certain values of the coordinates then this
condition can be violated even when the variations of the coordinates are
arbitrarily small. Therefore a property which is characterized by
inequalities connecting the coordinates of a point is stable (structurally
stable) with respect to variations of the coordinates. On the contrary,
a property expressed by equalities is unstable. If the coordinates
vary in such a way that $f_S$ continuously passes from its negative
values to the positive ones then in an intermediate position we have
$f_S = 0$, that is the point turns out to be on (S) at this moment. From
this point of view the fact that a point taken at random turns out
to belong to a submanifold of dimension k − p < k − 1 (1 < p ≤ k)
is still rarer because in such a case certain p relationships
having the form of an equality must be fulfilled.

For instance, let us consider system (VI.1) of two equations of
the first degree in two unknowns. The space whose elements are
such systems is six-dimensional since every system of this type is
defined by the six parameters $a_1$, $b_1$, $d_1$, $a_2$, $b_2$ and $d_2$ which can be
taken as the coordinates of the system. The singular case described
in Sec. VI.6 is characterized by the equality

$$D = a_1 b_2 - a_2 b_1 = 0$$

The basic case D ≠ 0 should therefore be regarded as stable here
whereas the singular cases are unstable. The subspace (S) of
singular cases is five-dimensional. Among the systems belonging to
(S) there are systems which have infinitely many solutions, and
they form a four-dimensional subspace of (S) (why is it so?). A typical
system belonging to (S) is therefore inconsistent (contradictory).
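The three cases can be told apart by a short computation. The sketch below assumes that system (VI.1) has the form $a_1 x + b_1 y = d_1$, $a_2 x + b_2 y = d_2$, which is an assumption about notation, and it uses exact rational arithmetic so that the test D = 0 is meaningful; the function name is hypothetical.

```python
from fractions import Fraction

def classify(a1, b1, d1, a2, b2, d2):
    """Classify the system a1*x + b1*y = d1, a2*x + b2*y = d2 (exact numbers assumed)."""
    D = a1 * b2 - a2 * b1
    if D != 0:
        return "unique solution"
    if a1 == b1 == a2 == b2 == 0:
        # Degenerate system 0 = d1, 0 = d2.
        return "infinitely many solutions" if d1 == d2 == 0 else "inconsistent"
    # D = 0: the system is consistent only if the remaining minors vanish too.
    if a1 * d2 - a2 * d1 == 0 and b1 * d2 - b2 * d1 == 0:
        return "infinitely many solutions"
    return "inconsistent"

# An arbitrarily small change of one coefficient leaves the singular subspace (S).
eps = Fraction(1, 10**6)
singular = classify(1, 2, 3, 2, 4, 6)
perturbed = classify(1, 2, 3, 2, 4 + eps, 6)
```

Within (S) one equality, D = 0, holds; the systems with infinitely many solutions need further equalities, which is why they are rarer still.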

By analogy with Sec. IX.9 we can introduce the concept of a
field defined on a k-dimensional manifold. If we choose certain
coordinates on the manifold then such a field turns into a function
of k variables. Quantities represented by functions of several vari- 
ables can therefore be either originally defined as quantities depen- 
ding on k independent variables or can turn into functions of several 
variables after some coordinates have been introduced in the mani- 
fold on which these quantities were originally defined as fields. 

We remark in conclusion that there are cases when parameters 
can assume arbitrary complex values; then we can speak about a 
“complex dimension”. Since every complex parameter has an arbitrary
real part and an arbitrary imaginary part, the “complex dimension k”
corresponds to the “real dimension 2k”.


§ 2. Surfaces and Curves in Space 


3. Surfaces in Space. We have shown (see Sec. IX.5) that an
equation of the form

$$F(x, y, z) = 0 \qquad (1)$$

defines a surface in the x, y, z-space (that is in the geometric space
in which a Cartesian coordinate system with the coordinates x, y
and z is introduced). Such a surface, which we designate by (S)
here, is the locus of points whose coordinates satisfy the given equation
(1). Equation (1) is then called the equation of the surface (S).
Conversely, if a surface (S) in the x, y, z-space is originally given then
we can obtain its equation (1). For example, if we have a sphere of
radius R with centre at the point (a, b, c) then reasoning as we
did in Sec. II.4 and taking advantage of formula (VII.14) we readily
deduce the equation of the sphere:

$$(x - a)^2 + (y - b)^2 + (z - c)^2 - R^2 = 0$$

The equation of a surface can also be written in other coordinate
systems. For instance, it has the form Φ (r, θ, φ) = 0 in spherical
coordinates.

By analogy with Sec. II.4, we see that in order to find the points
of intersection of three given surfaces whose equations are
represented in form (1) we have to solve a system of three equations in
three unknowns of the form

$$F_1(x, y, z) = 0, \quad F_2(x, y, z) = 0, \quad F_3(x, y, z) = 0$$

The notions of algebraic and transcendental surfaces are introdu- 
ced as it was done in Sec. II.7 for plane curves. As in Sec. II.8, there 
can also be singular cases here: imaginary surfaces, degeneration of 
a surface and disintegration of a surface. It should be taken into 
account that in Sec. II.8 we considered cases when a curve could 
degenerate into a point, but a surface can degenerate not only into 
a point but also into a curve. For instance, a “sphere of zero radius”
is nothing but a point whereas an infinite “circular cylinder of zero
diameter” is a straight line etc.

4. Cylinders, Cones and Surfaces of Revolution. For example,
let us take the equation z − x² = 0. If we consider the corresponding
geometric figure as a plane curve then the equation represents
a parabola (L) with the equation z = x² lying in the x, z-plane. We
see that the point O with the coordinates x = 0 and z = 0 and the point
A with the coordinates x = 2 and z = 4 belong to the parabola.

Fig. 207    Fig. 208

But we can consider the same equation with respect to the
spatial coordinate system x, y, z and then we obtain the cylindrical
surface depicted in Fig. 207 whose equation is z − x² = 0.
The parabola serves as the directing curve of the cylinder, and its
elements are parallel to the y-axis. This is a parabolic
cylinder. Indeed, besides the point A (2, 0, 4), the points (2, 5, 4)
and (2, −8, 4) also belong to the surface. Moreover, all the points
with the coordinates (2, y, 4) belong to the surface for an arbitrary
y: the equation does not involve y, and therefore y can be arbitrary
while only x and z have to satisfy the equation. These
points cover the whole straight line shown in Fig. 207 in heavy line.
All other points of the parabola (L) can be treated in like manner.

Similarly, any equation of the form F (x, z) = 0 is an equation
of a surface in the x, y, z-space which is a cylindrical surface whose
elements are parallel to the y-axis and whose directing curve lies
in the plane xOz and is represented in this plane by the same equation
F (x, z) = 0. Accordingly, equations of the form Φ (x, y) = 0
or Ψ (y, z) = 0 are the equations of cylindrical surfaces with
elements parallel to the axis Oz or Ox, respectively. For instance, the
equation x² + y² = R² represents a right circular cylinder of radius
R whose axis is the axis Oz (the same equation defines a circle in
the plane xOy).

Now let us consider an equation of form (1) under the assumption
that the function F is homogeneous (see the end of Sec. IX.12). Let
us prove that such an equation is the equation of a conic surface
whose vertex is at the origin. Actually, suppose that a point A
with coordinates (x, y, z) belongs to the surface in question (see
Fig. 208). Then F (x, y, z) = 0 because the coordinates of the point
A must satisfy the equation of the surface. Now let us take any point
B with coordinates (tx, ty, tz) where t is an arbitrary positive
number. Then

$$F(tx, ty, tz) = t^k F(x, y, z) = t^k \cdot 0 = 0$$

(k being the degree of homogeneity of F), which means that the point B
also belongs to the surface. But if we make t vary from 0 to ∞ then
the point B will run along the whole ray l which thus belongs to the
surface. Hence, the surface in question contains, together with each
point A belonging to the surface, the whole ray l. This implies that
the surface is conic. More precisely, it is a “semi-cone”; we obtain
a complete cone if it is permissible to substitute negative values
of t into identity (IX.12). For example,
the equation ax² + by² − cz² = 0 (where a, b and c are positive)
is the equation of a cone. Since the line of intersection of the surface
with the plane z = 1 is an ellipse (check it!), the surface is an
elliptic cone whose axis is the axis Oz.

In conclusion let us consider the equation of a surface of
revolution. For example, let a curve (L) lying in the plane yOz and
having an equation of the form F (y, z) = 0 be rotated about the
z-axis. Let us deduce the equation of the surface thus obtained (see
Fig. 209). To do this we take an arbitrary point M (x, y, z) on
the surface and consider the corresponding point $\bar M(\bar x, \bar y, \bar z)$
belonging to the curve (L). Then we have $\bar z = z$ and $\bar x = 0$. To compute
$\bar y$ we remark that $\bar y = K\bar M = KM = \sqrt{x^2 + y^2}$ (check it!).
Since the point $\bar M$ lies on (L), we have $F(\bar y, \bar z) = 0$, i.e.
$F(\sqrt{x^2 + y^2}, z) = 0$. It is the last equation that is the equation
of the surface of revolution in question. For example, the equation
z = ay² is the equation of a parabola lying in the plane yOz.
Therefore the equation z = a (x² + y²) is the equation of the
corresponding paraboloid of revolution.
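Two of the constructions of this section lend themselves to a quick numerical check. The sketch below is illustrative only: the coefficients of the cone, the coefficient of the parabola and the helper names are assumptions. It evaluates a homogeneous function of degree 2 along a ray through the origin, and builds the equation of a surface of revolution from the equation F (y, z) = 0 of the rotated curve.

```python
import math

def cone(x, y, z, a=1.0, b=2.0, c=3.0):
    """Left-hand side of a*x^2 + b*y^2 - c*z^2 = 0, homogeneous of degree 2."""
    return a * x * x + b * y * y - c * z * z

def revolve(F):
    """From F(y, z) = 0 in the plane yOz build G(x, y, z) = F(sqrt(x^2 + y^2), z)."""
    return lambda x, y, z: F(math.hypot(x, y), z)

def parabola(y, z, a=0.5):
    """Left-hand side of z - a*y^2 = 0 (the value of a is a hypothetical choice)."""
    return z - a * y * y

paraboloid = revolve(parabola)

# The point (1, 1, 1) lies on the cone, and so do the points (t, t, t) of its ray.
on_ray = [cone(t, t, t) for t in (0.5, 2.0, 7.25)]

# The point (0, 2, 2) of the parabola, turned about the z-axis through several angles.
turned = [paraboloid(2.0 * math.sin(alpha), 2.0 * math.cos(alpha), 2.0)
          for alpha in (0.0, 1.0, 2.5)]
```

All the values collected in `on_ray` and `turned` are expected to vanish up to rounding.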

5. Curves in Space. A curve in space can be regarded as a line
of intersection of two surfaces (that is as the locus of points common
to both surfaces) or as a trace (trajectory) of a moving point.

Fig. 209    Fig. 210

In the first case the equations of both surfaces can be put down in
the form

$$F_1(x, y, z) = 0, \quad F_2(x, y, z) = 0 \qquad (2)$$


Since the points of the line of intersection of the surfaces belong 
simultaneously to both surfaces this line is the locus of points whose 
coordinates simultaneously satisfy both equations (2). Hence, 
system (2) should be regarded as a system of two equations in three 
unknowns. 

From the point of view of the second approach the equation of
a curve has a parametric form

$$x = \varphi(t), \quad y = \psi(t), \quad z = \chi(t) \qquad (3)$$

(see Sec. VII.23). To pass from form (3) to form (2) we must eliminate
t from equations (3) (for instance, we can express t in terms of x
from the first equation and then substitute the result into the other
two equations) if it is required and if it is possible. To perform the
reverse transition from (2) to (3) (on the same conditions) we can
denote one of the variables by t and then solve equations (2) for
the other two variables. For instance, we can put x = t
and then solve the equations for y and z expressing these variables
as functions of t.

We sometimes encounter the problem of finding the projection
(L') of a given curve (L) lying in space on one of the coordinate planes.
As an example, see Fig. 210 where the projection on the plane xOy
is shown. To solve the problem we must determine the relationship
between the coordinates x and y of the points belonging to (L). If
(L) is represented by equations (2) then in order to find (L') we must
eliminate z from the equations. If (L) is represented parametrically
in form (3) then we can simply retain the first two equalities.
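The passage between the two representations of a curve, and the projection on a coordinate plane, can be illustrated on a helix. The curve, the constants R and B and the function names below are assumptions chosen for this sketch and do not come from the text. Here the parameter is eliminated by means of the third equation, t = z / B, which is one of several possible choices.

```python
import math

R, B = 2.0, 0.5  # hypothetical radius and pitch parameter of the helix

def curve(t):
    """Parametric form (3): x = R cos t, y = R sin t, z = B t."""
    return (R * math.cos(t), R * math.sin(t), B * t)

def surfaces(x, y, z):
    """Form (2): left-hand sides of the two equations obtained by eliminating t = z / B."""
    return (x - R * math.cos(z / B), y - R * math.sin(z / B))

def projection_residual(x, y):
    """Projection on the plane xOy: left-hand side of x^2 + y^2 - R^2 = 0."""
    return x * x + y * y - R * R

samples = [curve(t) for t in (-3.0, 0.0, 1.7, 10.0)]
```

Every sample point satisfies both equations of form (2), and its first two coordinates satisfy the equation of the projection, a circle of radius R.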

6. Parametric Representation of Surfaces in Space. Parametric
Representation of Functions of Several Variables. It was shown in
Sec. 5 that to obtain a parametric representation of a curve we must
introduce one parameter [see formula (3)]. Now let us find out the
geometric meaning of equations of the form

$$x = \varphi(u, v), \quad y = \psi(u, v), \quad z = \chi(u, v) \qquad (4)$$

containing two independent parameters u and v which can take
on arbitrary numerical values. It is natural to expect that such
equations represent a surface in space since a point on a surface has
two degrees of freedom and therefore in order to specify such a point
two parameters should be indicated.

To justify this supposition let us take any two equations (4),
for instance, the first two equations. Generally speaking, we can
express u and v from these equations (at least theoretically) in terms
of x and y. Thus we obtain expressions of the form u = u (x, y)
and v = v (x, y). If we then substitute them into the third equation
(4) we shall arrive at an equation of the form z = z (x, y) which,
as it was shown in Sec. IX.1, represents a surface in space.
Consequently, equations (4) represent a surface in space in parametric
form.

We shall show in Sec. XI.14 that there are some singular cases
when the above transition from (4) to the formula z = z (x, y) is
impossible. These are the cases when the surface in question
degenerates into a curve or into a point. For instance, the “surface”
x = u + v, y = 2u + 2v, z = 1 − u − v is in fact a line (why is
it so?).
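The degenerate example can be examined directly. In the sketch below the candidate line x = s, y = 2s, z = 1 − s and the helper names are assumptions introduced for illustration. It checks that every point of the “surface” depends on u and v only through the sum s = u + v, and that all second-order minors formed from the partial derivatives with respect to u and v vanish.

```python
def surface(u, v):
    """The degenerate parametric 'surface' of the example."""
    return (u + v, 2 * u + 2 * v, 1 - u - v)

def line(s):
    """Candidate line with the single parameter s = u + v."""
    return (s, 2 * s, 1 - s)

# Partial derivatives of (x, y, z) with respect to u and v (constants here).
d_du = (1, 2, -1)
d_dv = (1, 2, -1)

def minors(p, q):
    """All 2x2 minors of the 3x2 matrix whose columns are p and q."""
    return [p[i] * q[j] - p[j] * q[i] for i in range(3) for j in range(i + 1, 3)]

same_points = all(surface(u, v) == line(u + v)
                  for u in range(-3, 4) for v in range(-3, 4))
```

Since u and v enter only through u + v, varying them provides one degree of freedom instead of two.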

A sketch of a surface (S) represented by equations (4) is depicted
in Fig. 211. If we make u assume different constant values and if
we vary v then we obtain different curves lying on (S), each of these
curves being completely specified by the corresponding constant
value of u. This is so because when u is fixed and only v is varied
we have a single parameter, i.e. only one degree of freedom. Similarly,
putting v = const we obtain another family of curves on (S). These
curves can be taken as coordinate curves on (S), and the parameters
u and v can be regarded as coordinates on (S).

Fig. 211 (curves u = const and curves v = const on the surface (S))

Equations (4) define a certain functional relationship between 
x, y and z. Actually, if the values of x and y are given we can (at 
least theoretically) determine the corresponding values of u and v, 
as it was done above. The values u and v thus found yield the
corresponding value of z. Hence, we obtain a function z = z (x, y)
which is originally represented in parametric form (4) and whose 
graph is the surface (S) considered above. 

In the general case of an arbitrary number of variables the
parametric representation of a functional relationship is introduced in
the following way. Let there be given the equations

$$x_1 = f_1(t_1, t_2, \dots, t_m),$$
$$x_2 = f_2(t_1, t_2, \dots, t_m),$$
$$\dots \qquad (5)$$
$$x_n = f_n(t_1, t_2, \dots, t_m)$$

where the variables $t_1, t_2, \dots, t_m$ are considered to be parameters.
If m < n then choosing m equations we can express $t_1, t_2, \dots, t_m$
from these equations in terms of the corresponding variables x.
This is usually possible with the exception of some cases of
degeneration which will be discussed in Sec. XI.14. Substituting the
expressions thus obtained into the remaining equations (5) we get a
representation of n − m variables of type x as functions of the other m
variables x. Hence, we can say that equations (5) represent an
m-dimensional manifold lying in the n-dimensional space. When we
choose a point belonging to such a manifold we have m degrees
of freedom. In those cases when there is a degeneration the
dimension of the manifold turns out to be less than m. If m > n then,
generally speaking, equations (5) do not define any functional
relationship between the variables x.

The derivatives of functions represented parametrically are found
by analogy with Sec. IX.13 where we differentiated implicit functions.
For example, let a function z = z (x, y) defined by formulas (4) be
considered and let it be necessary to compute the derivative $z'_x$.
Rewriting the first two equations (4) in the form φ (u, v) − x = 0,
ψ (u, v) − y = 0 we can find $u'_x$ and $v'_x$ as it was done for
equations (IX.17) (by the way, in practical calculations this can
be performed without rewriting the equations). The condition
guaranteeing the possibility of such computations is

$$
\begin{vmatrix} \varphi'_u & \varphi'_v \\ \psi'_u & \psi'_v \end{vmatrix} \neq 0
$$

Differentiating the last equation (4) we obtain the formula
$z'_x = \chi'_u u'_x + \chi'_v v'_x$. Finally, substituting the values
$u'_x$ and $v'_x$ found above into the formula we obtain the desired
expression of $z'_x$. Starting from this result we can find the
derivatives of higher order by means of repeated differentiation.
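
The computation just described can be checked numerically. The sketch below writes formulas (4) as x = φ(u, v), y = ψ(u, v), z = χ(u, v) and assumes, purely for illustration, the surface x = u + v, y = u − v, z = uv (this example and the name `z_x` are not taken from the text). For that surface z = (x² − y²)/4, so the expected derivative is x/2.

```python
def z_x(phi_u, phi_v, psi_u, psi_v, chi_u, chi_v):
    """Derivative z'_x from the partial derivatives of phi, psi, chi at (u, v)."""
    det = phi_u * psi_v - phi_v * psi_u      # the determinant that must not vanish
    if det == 0:
        raise ValueError("degenerate case: the determinant vanishes")
    # Differentiating phi(u, v) - x = 0 and psi(u, v) - y = 0 with respect to x:
    #   phi_u*u_x + phi_v*v_x = 1,   psi_u*u_x + psi_v*v_x = 0
    # and solving this 2x2 system by Cramer's rule:
    u_x = psi_v / det
    v_x = -psi_u / det
    # Chain rule applied to z = chi(u, v)
    return chi_u * u_x + chi_v * v_x

# Illustrative surface (an assumption): x = u + v, y = u - v, z = u*v
u, v = 3.0, 1.0
value = z_x(1.0, 1.0, 1.0, -1.0, chi_u=v, chi_v=u)
print(value, (u + v) / 2)   # both numbers should agree
```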


§ 3. Algebraic Surfaces of the First and of 
the Second Orders 


7. Algebraic Surfaces of the First Order. The general form of an 
equation of a surface of the first order is put down as 


$$Ax + By + Cz + D = 0 \tag{6}$$


(compare with Sec. II.9). To find out what surface is defined by such
an equation let us introduce the vector


$$\mathbf{a} = A\mathbf{i} + B\mathbf{j} + C\mathbf{k} \tag{7}$$


Then equation (6) can be rewritten in the form a·r + D = 0 [see
formulas (VII.7) and (VII.12)] where r is the radius-vector. But
$\mathbf{a}\cdot\mathbf{r} = a\,\mathrm{proj}_{\mathbf{a}}\mathbf{r}$
[see formula (VII.4)], where a is the length of the vector a, which implies


$$a\,\mathrm{proj}_{\mathbf{a}}\mathbf{r} + D = 0, \quad\text{i.e.}\quad \mathrm{proj}_{\mathbf{a}}\mathbf{r} = -\frac{D}{a}$$


Thus, the surface is the locus of all points M for which the projections
of their radius-vectors on the constant vector a have the constant
value −D/a. Fig. 212 shows that this is a plane (P) which is
perpendicular to the vector a.

Hence, surfaces of the first order are planes. 

Let us consider some simple problems. 

(1) Investigate in what way variations of the values of the coefficients
A, B, C and D affect the position of the plane (P). This can
be seen from Fig. 212. For instance, if we vary D retaining some
constant values of A, B and C the plane will be in a translatory
motion. In particular, for D = 0 it passes through the origin of
coordinates. Variations of the coefficients A, B and C result in
rotating the vector a and, consequently, in rotating the plane (P).
If A = 0 then the vector a lies in the plane yOz, and the plane (P)
is therefore parallel to the x-axis. If, in addition, D = 0 the plane
will pass through the x-axis.

The cases when other coefficients turn into zero are investigated
similarly.

Let us also put down the equations of the coordinate planes:
z = 0 is the equation of the plane xOy, x = 0 is the equation of
the plane yOz and y = 0 is the equation of the plane xOz.

(2) Let it be necessary to draw a plane which is perpendicular
to a given vector (7) and passes through a given point $(x_1, y_1, z_1)$.
As for the first problem in Sec. II.9, we deduce from (6) the following
answer:

$$A(x - x_1) + B(y - y_1) + C(z - z_1) = 0$$

(3) Determine the angle φ between two given planes. Let the
equations of the planes be

$$A_1x + B_1y + C_1z + D_1 = 0 \quad\text{and}\quad A_2x + B_2y + C_2z + D_2 = 0 \tag{8}$$

Then either the angle φ is equal to the angle between the vectors

$$\mathbf{a}_1 = A_1\mathbf{i} + B_1\mathbf{j} + C_1\mathbf{k} \quad\text{and}\quad \mathbf{a}_2 = A_2\mathbf{i} + B_2\mathbf{j} + C_2\mathbf{k} \tag{9}$$

(which are perpendicular to the planes) or these angles are supplementary
angles (because the arms of these angles are mutually perpendicular,
as it is seen in Fig. 213). Hence, the cosines of the angles
are either equal or differ only in their signs. Computing the angle
between the vectors we obtain

$$\cos\varphi = \pm\frac{A_1A_2 + B_1B_2 + C_1C_2}{\sqrt{A_1^2 + B_1^2 + C_1^2}\,\sqrt{A_2^2 + B_2^2 + C_2^2}}$$

Fig. 212    Fig. 213
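
As a small numerical illustration of problem (3), the following sketch evaluates the cosine of the angle between the vectors perpendicular to two planes. The function name `cos_between_normals` and the sample planes are assumptions made for the example; each plane is passed as the tuple (A, B, C, D) of its coefficients in Ax + By + Cz + D = 0.

```python
import math

def cos_between_normals(p1, p2):
    """Cosine of the angle between the normal vectors of two planes.

    The angle between the planes has this cosine up to sign.
    """
    a1, b1, c1 = p1[:3]          # D does not affect the direction of the normal
    a2, b2, c2 = p2[:3]
    dot = a1 * a2 + b1 * b2 + c1 * c2
    norm1 = math.sqrt(a1 * a1 + b1 * b1 + c1 * c1)
    norm2 = math.sqrt(a2 * a2 + b2 * b2 + c2 * c2)
    return dot / (norm1 * norm2)

# Sample planes (assumed): x = 0 and x + y = 0 meet at 45 degrees
c = cos_between_normals((1, 0, 0, 0), (1, 1, 0, 0))
print(math.degrees(math.acos(c)))
```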


(4) The condition for the parallelism of two planes (8) is put
down as

$$\frac{A_1}{A_2} = \frac{B_1}{B_2} = \frac{C_1}{C_2}$$

It is implied by the analogous condition for the corresponding vectors
(9) (see problem 2 in Sec. VII.10). But if

$$\frac{A_1}{A_2} = \frac{B_1}{B_2} = \frac{C_1}{C_2} = \frac{D_1}{D_2}$$

equations (8) are equivalent, that is the planes coincide.

(5) The line (straight line) of intersection of two planes (8) is
represented by two equations (8) if we consider them as a system
of simultaneous equations. We can pass from the system to parametric
form (VII.33), as it was described in Sec. 5.

Let us illustrate such a transition by taking a concrete example.
Let a straight line be given as the intersection of two planes with
the equations

$$x - 2y + z - 3 = 0, \qquad 2x + y + 4z - 5 = 0$$

Designating z = t we obtain

$$x - 2y = -t + 3, \qquad 2x + y = -4t + 5$$

Solving the system we find $x = \frac{13}{5} - \frac{9}{5}t$,
$y = -\frac{1}{5} - \frac{2}{5}t$ and z = t.
Hence (see Fig. 173) the straight line in question passes through
the point $\left(\frac{13}{5}, -\frac{1}{5}, 0\right)$ and is parallel to the vector
$\mathbf{b} = -\frac{9}{5}\mathbf{i} - \frac{2}{5}\mathbf{j} + \mathbf{k}$.

The problems involving straight lines and planes can often be
solved by means of such transition on the basis of properties of
vectors.
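
The transition to parametric form can be sketched in code. The function below (the name `intersection_line` is an assumption) takes z = t as the parameter and solves the remaining two equations by Cramer's rule, using exact fractions. It is tried on the planes x − 2y + z − 3 = 0 and 2x + y + 4z − 5 = 0; the sketch covers only the case where z can serve as the parameter.

```python
from fractions import Fraction

def intersection_line(p1, p2):
    """Planes are tuples (A, B, C, D) of Ax + By + Cz + D = 0.

    Returns (point, direction) of the line x = x0 + bx*t, y = y0 + by*t, z = t.
    """
    a1, b1, c1, d1 = map(Fraction, p1)
    a2, b2, c2, d2 = map(Fraction, p2)
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("z cannot serve as the parameter for these planes")

    def solve(r1, r2):
        # Cramer's rule for a1*x + b1*y = r1, a2*x + b2*y = r2
        return (r1 * b2 - b1 * r2) / det, (a1 * r2 - r1 * a2) / det

    x0, y0 = solve(-d1, -d2)     # the point of the line with z = 0
    bx, by = solve(-c1, -c2)     # coefficients of t in x and y
    return (x0, y0, Fraction(0)), (bx, by, Fraction(1))

point, b = intersection_line((1, -2, 1, -3), (2, 1, 4, -5))
print(point, b)
```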


Fig. 215
(a) There is no common point
(b) Common straight line


Now we can easily give the geometric interpretation of different
cases which can occur in solving system (VI.5) of three equations
in three unknowns. Each of the equations is the equation of a plane
in the x, y, z-space, and hence the problem reduces to finding the
point of intersection of the corresponding planes $(P_1)$, $(P_2)$ and
$(P_3)$. The determinant D of the system is equal to the triple scalar
product of the three corresponding vectors which are perpendicular
to the planes (see Sec. VII.15). If D ≠ 0 these vectors are not parallel
to the same plane, and the plane $(P_3)$ will therefore intersect
the line of intersection of the planes $(P_1)$ and $(P_2)$ at a single point.
Hence system (VI.5) has a unique solution. If D = 0 the vectors
are parallel to a plane T (see Fig. 214). This implies that the planes
$(P_1)$, $(P_2)$ and $(P_3)$ are parallel to a straight line (l), and therefore
they either have no points in common at all or have infinitely many
common points which constitute a straight line parallel to (l).
In the first case the system has no solutions and in the second it
has infinitely many solutions (the whole “straight line of solutions”).
Possible dispositions of the planes are shown in Fig. 215 for both
cases. (Let the reader think what other dispositions can be found
here.)
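
The criterion D ≠ 0 is easy to test numerically. The sketch below computes the third-order determinant (equivalently, the triple scalar product of the three normal vectors) by expansion along the first row. The names `det3` and `planes_meet_at_single_point` and the sample normals are assumptions made for the illustration.

```python
def det3(m):
    """Determinant of a 3x3 array given as three rows."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * (b2 * c3 - b3 * c2)
            - b1 * (a2 * c3 - a3 * c2)
            + c1 * (a2 * b3 - a3 * b2))

def planes_meet_at_single_point(normals):
    """True when the three planes with the given normals have exactly one common point."""
    return det3(normals) != 0

# The coordinate planes meet at the origin only
print(planes_meet_at_single_point([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))
# Three normals parallel to the plane z = 0: no unique common point
print(planes_meet_at_single_point([(1, 0, 0), (0, 1, 0), (1, 1, 0)]))
```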

8. Ellipsoid. We shall begin with the canonical equation of an 
ellipsoid without giving its geometric definition. The equation is 
of the form 


$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \tag{10}$$


where a, b and c are positive constants called semi-axes of the ellipsoid.

By analogy with Sec. II.10, we can easily verify that |x| ≤ a,
|y| ≤ b and |z| ≤ c, that is an ellipsoid is a finite, bounded
surface, that the planes xOy, yOz and xOz are its planes of symmetry
and that the origin of coordinates is its centre of symmetry (the
centre of the ellipsoid).

To investigate the form of an ellipsoid let us apply the method
of parallel sections. The method consists in investigating the curves
of intersection of the surface in question with planes parallel to the
coordinate planes, that is with the planes whose equations are of the
form x = const, y = const and z = const. Let us first consider the
curve of intersection of our ellipsoid with the plane z = h which is
parallel to the plane xOy. For this purpose let us put z = h in
equation (10) which results in

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 - \frac{h^2}{c^2}$$

or

$$\frac{x^2}{\left(a\sqrt{1 - \frac{h^2}{c^2}}\right)^2} + \frac{y^2}{\left(b\sqrt{1 - \frac{h^2}{c^2}}\right)^2} = 1$$

Hence, we obtain an ellipse with semi-axes $a\sqrt{1 - h^2/c^2}$ and
$b\sqrt{1 - h^2/c^2}$. Thus, for h = 0 we have an ellipse with semi-axes a
and b. When |h| is increased the ellipse decreases but remains
similar to the original ellipse because the ratio of its semi-axes
is constant. When h assumes the values h = ±c the ellipse degenerates
into a point since its semi-axes become equal to zero. The
investigation of the curves of intersection of the ellipsoid with
the planes y = h and x = h yields similar results. Hence we conclude
that the ellipsoid has the form shown in Fig. 216.
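
The behaviour of the sections z = h can be tabulated with a few lines of code. This is only a sketch: the function name `ellipsoid_section` and the sample semi-axes a = 3, b = 2, c = 5 are assumptions chosen for the illustration.

```python
import math

def ellipsoid_section(a, b, c, h):
    """Semi-axes of the section of x^2/a^2 + y^2/b^2 + z^2/c^2 = 1 by the plane z = h."""
    if abs(h) > c:
        return None                       # the plane does not meet the ellipsoid
    k = math.sqrt(1.0 - (h / c) ** 2)     # common factor of both semi-axes
    return a * k, b * k

for h in (0, 3, 5, 6):
    print(h, ellipsoid_section(3, 2, 5, h))
```

The ratio of the two returned semi-axes stays equal to a/b for every |h| < c, which is the similarity of the sections noted above.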

If two semi-axes are equal, for instance, if a = b then the curves
of intersection with the planes z = h are circles. Hence, in this
case we obtain an ellipsoid of revolution, that is a surface generated
by the revolution of an ellipse about one of its axes, instead of a
triaxial ellipsoid which we have in the general case when the axes
are unequal. An ellipsoid of revolution is also referred to as a
spheroid. When a spheroid is generated by revolving an ellipse about
its major axis it is called a prolate ellipsoid of revolution (it
resembles an egg). If an ellipse rotates about its minor axis we have an
oblate ellipsoid of revolution. Finally, if all the three axes are equal
to each other the ellipsoid turns into a sphere.

Fig. 216

By analogy with Sec. II.10, we can easily prove that a triaxial
ellipsoid can be obtained by performing uniform contraction (or
stretching) of a sphere towards two coordinate planes. The contraction
towards one of the coordinate planes necessarily results
in a spheroid. At the same time, Sec. XI.6 implies that when performing
uniform contraction of an ellipsoid we again obtain an
ellipsoid.


9. Hyperboloids. There are two types of hyperboloid. A hyperboloid
of one sheet is represented by the canonical equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \tag{11}$$

The section of the surface by the plane z = h yields an ellipse whose
semi-axes are $a\sqrt{1 + h^2/c^2}$ and $b\sqrt{1 + h^2/c^2}$ (check it up!).
Hence, for h = 0 we have an ellipse with semi-axes a and b. When |h|
increases the sizes of the ellipse also increase and tend to infinity,
as |h| → ∞. All the ellipses are similar because the ratio of their
semi-axes equals a/b = const. We similarly verify that the sections
by the planes y = h and x = h are hyperbolas. Hence we obtain
a surface which is depicted in Fig. 217. A hyperboloid of one sheet,
like an ellipsoid, has three planes of symmetry and a centre of
symmetry.

If a = b we have a surface which is generated by revolving a
hyperbola about its conjugate axis, that is a hyperboloid of revolution
(of one sheet).

There is another way of interpreting a hyperboloid of one sheet. 
Let us first take the case when the hyperboloid of one sheet having 
equation (11) is a surface of revolution, that is when a = b. [Equation
(11) turns into

$$\frac{x^2}{a^2} + \frac{y^2}{a^2} - \frac{z^2}{c^2} = 1$$

for this case.] Take the plane y = b = a. Let us consider the section
of the hyperboloid in question by this plane. For this purpose we
substitute y = b = a into equation (11) which results in

$$\frac{x^2}{a^2} - \frac{z^2}{c^2} = 0, \quad\text{i.e.}\quad \frac{x}{a} - \frac{z}{c} = 0 \quad\text{and}\quad \frac{x}{a} + \frac{z}{c} = 0 \qquad (y = b)$$


Hence, the curve of intersection disintegrates into a pair of straight
lines x/a − z/c = 0, y = b and x/a + z/c = 0, y = b which intersect
at the point A (0, b, 0) (the lines are shown in Fig. 218). Because
of the axial symmetry we have the same form of section by
any plane which is parallel to the z-axis and which touches the
“gorge” circle (the ellipse corresponding to the section z = h = 0
is called the gorge ellipse; in the case a = b we thus have the gorge
circle). Consequently, the whole hyperboloid is entirely made up
of these straight lines forming two families of straight lines, as it
is shown in Fig. 218, because through each of its points there pass
two straight lines which lie entirely on the surface of the hyperboloid.
This property is analogous to the properties of a cylinder or of a
cone which are also made up of straight lines (their elements), but
in the latter cases the straight lines belong to a single family of
lines. Incidentally, we see that a hyperboloid of revolution (of one
sheet) can be generated by revolving one of the two skew lines about
the other. Now, passing to the general case of a hyperboloid of one
sheet when the parameters a, b and c entering into (11) can take on
arbitrary positive numerical values we see that such a hyperboloid
can be obtained from a hyperboloid of revolution if we uniformly
contract the latter. But it is obvious that straight lines pass into
straight lines under such a deformation, and therefore we conclude
that a hyperboloid of one sheet of the general form is also made up
of straight lines forming two families of straight lines. In conclusion,
let us note that the plane depicted in Fig. 218 is the tangent plane
to the hyperboloid at the point A. In fact, the tangent plane passing
through a point A belonging to an arbitrary surface (S) is, by
definition, a plane which touches any curve lying on (S) and passing
through the point A. Therefore the tangent plane to our hyperboloid
must pass through both straight lines which entirely lie on the
hyperboloid and intersect at the point A. Hence, we see that a
tangent plane to a surface may intersect the surface along two
distinct lines.

Fig. 217    Fig. 218
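
That the two straight lines through A (0, b, 0) lie entirely on the hyperboloid of one sheet can be confirmed with exact arithmetic. The sketch below parametrises the lines x/a ∓ z/c = 0, y = b as x = at, z = ±ct and checks the equation x²/a² + y²/b² − z²/c² = 1 at several points. The helper names and the sample values a = 2, b = 2, c = 3 are assumptions made for the illustration.

```python
from fractions import Fraction

def on_hyperboloid(x, y, z, a, b, c):
    """Exact test of x^2/a^2 + y^2/b^2 - z^2/c^2 = 1."""
    return (Fraction(x) ** 2 / a ** 2
            + Fraction(y) ** 2 / b ** 2
            - Fraction(z) ** 2 / c ** 2) == 1

def ruling_points(a, b, c, sign, ts):
    """Points of the line x/a - sign*z/c = 0, y = b for the parameter values ts."""
    return [(a * t, b, sign * c * t) for t in ts]

a, b, c = 2, 2, 3
for sign in (1, -1):
    points = ruling_points(a, b, c, sign, range(-5, 6))
    print(sign, all(on_hyperboloid(x, y, z, a, b, c) for x, y, z in points))
```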

The canonical equation of a hyperboloid of two sheets has the form 


$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1$$

The section by the plane z = h is an ellipse with semi-axes

$$a\sqrt{\frac{h^2}{c^2} - 1} \quad\text{and}\quad b\sqrt{\frac{h^2}{c^2} - 1}$$

The plane z = h does not therefore intersect the surface for |h| < c.
For |h| = c, that is for h = ±c, we obtain ellipses whose
axes are equal to zero, that is single points. When |h| increases
from c to infinity the sizes of the corresponding ellipses also tend
to infinity. The ratio of the semi-axes being equal to the constant
a/b, all the ellipses are similar. The intersection with the planes
y = h and x = h yields hyperbolas. Hence we obtain a surface which
consists of two distinct portions (“sheets”) which extend to infinity.
The surface is depicted in Fig. 219. If a = b we obtain a hyperboloid
of revolution generated by revolving a hyperbola about its transverse
axis.

Fig. 219

10. Paraboloids. There are also two types of paraboloid. An elliptic
paraboloid has the canonical equation

$$z = ax^2 + by^2 \qquad (a, b > 0)$$

The section by the plane z = h is a curve represented by the equation
ax² + by² = h which can be put down as
$\frac{x^2}{h/a} + \frac{y^2}{h/b} = 1$. Hence, for h > 0, we obtain an
ellipse with semi-axes $\sqrt{h/a}$ and $\sqrt{h/b}$.
There will be no intersection for h < 0, and we shall have a single
point (the origin of coordinates) for h = 0. When h increases from
0 to infinity the sizes of the ellipses tend to infinity, and all the
ellipses are similar because the ratio of their semi-axes assumes
the constant value $\sqrt{h/a} : \sqrt{h/b} = \sqrt{b/a}$. The sections by the
planes y = h and x = h are parabolas. Hence, we obtain a surface
which is shown in Fig. 220. The paraboloid has two planes of
symmetry (namely, the planes x = 0 and y = 0). If a = b we have
a paraboloid of revolution generated by revolving a parabola about
its axis.

A hyperbolic paraboloid has the canonical equation

$$z = -ax^2 + by^2 \qquad (a, b > 0) \tag{12}$$

Intersecting the surface by the plane x = 0 we obtain the parabola
z = by² which opens upwards (in the positive direction of the z-axis).
On the contrary, the sections by the planes y = h are the
parabolas z = −ax² + bh² which open downwards. The sections
are shown in Fig. 221. Finally, the sections by the planes z = h
are hyperbolas. Thus, we obtain a surface having the form of a saddle.

Fig. 220    Fig. 221

It can be shown that this surface, like a hyperboloid of one sheet,
is entirely made up of straight lines forming two families. For
instance, the tangent plane to the hyperbolic paraboloid (shown
in Fig. 221) passing through the origin of coordinates is the plane
z = 0. But at the same time, if we put z = 0 in equation (12) we
get $\sqrt{a}\,x = \pm\sqrt{b}\,y$ which means that the tangent plane intersects
the surface along two straight lines.

11. General Review of Algebraic Surfaces of the Second Order.
The general form of an equation representing a surface of the second
order can be written as

$$Ax^2 + 2Bxy + Cy^2 + 2Dxz + 2Eyz + Fz^2 + Gx + Hy + Iz + J = 0 \tag{13}$$

(compare with Sec. II.13).

We shall show in Sec. XI.12 that we can always perform a rotation
of the original Cartesian axes so that the equation related to
the new coordinates should no longer contain the terms involving
the products of different coordinates. Therefore if we denote the new
coordinates as x', y' and z' the equation will not contain the products
x'y', x'z' and y'z' and thus it will have the form


$$A'x'^2 + C'y'^2 + F'z'^2 + G'x' + H'y' + I'z' + J' = 0 \tag{14}$$

where A', C', F', G', H', I' and J' are some new constant coefficients
(compare with Sec. II.13). Depending on the signs of the
coefficients A', C' and F' we perform the further investigation in
different ways. Let us first suppose that all these coefficients are
different from zero and have the same sign. For definiteness, let
them be positive. Then, as it was done in Sec. II.13, we complete
the squares and perform a parallel translation of the coordinate
axes which leads to a new equation of the form

$$A'x''^2 + C'y''^2 + F'z''^2 + J' = 0$$

i.e.

$$\frac{x''^2}{-J'/A'} + \frac{y''^2}{-J'/C'} + \frac{z''^2}{-J'/F'} = 1$$

where x'', y'' and z'' are the new coordinates appearing after the
parallel translation and J' now denotes the constant term obtained
after completing the squares. It follows that if J' < 0 we get the canonical
equation of an ellipsoid, and therefore the original surface (13)
is also an ellipsoid whose planes of symmetry and centre of symmetry
are displaced and turned relative to the coordinate planes and the
origin of coordinates of the original coordinate system x, y, z.
But if J' > 0 or J' = 0 we obtain, respectively, an imaginary
surface or a single point. Similar results are obtained for the case when
the coefficients A', C' and F' are negative.

We suggest that the reader should verify that when the coefficients
A', C' and F' are different from zero but have unlike signs
the corresponding surface will be either a hyperboloid or a cone
of the second order. We usually call such a cone elliptic although
its sections, by different planes, can be ellipses, hyperbolas or
parabolas (see Fig. 86). In particular, there can also be a circular
cone.

If exactly one of the coefficients A', C' and F' entering into equation
(14) is equal to zero, for instance, if F' = 0 whereas the corresponding
coefficient I' is different from zero then the surface will
be a paraboloid. It can be shown (but we are not going to prove it
here) that in all other cases there can be only a cylinder of the second
order, a pair of planes (which may coincide), degeneration into
a straight line and an imaginary surface. Besides, a cylinder of
the second order can have a directing curve which is an ellipse
(a circle in a particular case), a hyperbola or a parabola. Accordingly,
we can have an elliptic (or circular in a particular case), a hyperbolic
or a parabolic cylinder. For example, the cylinder described
in the beginning of Sec. 4 is parabolic.
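
For the case when all three coefficients of the squares are different from zero, the classification by signs can be summarised as a short routine. This is a sketch under stated assumptions: the equation has already been reduced to Ax² + Cy² + Fz² + J = 0 (squares completed, primes dropped), and the function name `classify_central_quadric` is hypothetical. Paraboloids, cylinders and the other degenerate cases are not covered.

```python
def classify_central_quadric(A, C, F, J):
    """Type of the surface A*x^2 + C*y^2 + F*z^2 + J = 0 with non-zero A, C, F."""
    if 0 in (A, C, F):
        raise ValueError("this sketch covers only non-zero A, C, F")
    positives = sum(1 for k in (A, C, F) if k > 0)
    if positives in (0, 3):                       # coefficients of like sign
        if J == 0:
            return "single point"
        # an ellipsoid arises when J has the sign opposite to the coefficients
        return "ellipsoid" if (J < 0) == (positives == 3) else "imaginary surface"
    if J == 0:                                    # unlike signs, no constant term
        return "cone of the second order"
    # After division by -J the equation reads (..)x^2 + (..)y^2 + (..)z^2 = 1;
    # count how many of the new coefficients are positive.
    plus = sum(1 for k in (A, C, F) if (k > 0) == (J < 0))
    return "hyperboloid of one sheet" if plus == 2 else "hyperboloid of two sheets"

print(classify_central_quadric(1, 4, 9, -36))
print(classify_central_quadric(1, 1, -1, -1))
print(classify_central_quadric(1, 1, -1, 1))
```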


CHAPTER XI 


Matrices and Their 
Applications 


§ 1. Matrices 


Matrices were first introduced by the Irish mathematician 
W. Hamilton (1805-1865) and the English mathematician A. Cayley 
(1821-1895). They are widely used now in various branches of mathe- 
matics because their application considerably simplifies the inve- 
stigation of complicated systems of equations. 

1. Definitions. We begin with some formal definitions whose 
advisability will be clarified later. A matrix is a rectangular array 
composed of numbers or some other objects. Unless the contrary 
is stated, we shall only deal with real number matrices, that is ma- 
trices composed of real numbers. For instance, such a matrix can 


have the form

[display (1): a (2 × 3) matrix, a (3 × 3) matrix, the column matrix
$\begin{pmatrix} 1 \\ -2 \\ 0 \\ 3 \end{pmatrix}$ and the (1 × 1) matrix (5), etc.;
the elements of the first two arrays are illegible in the source]   (1)

Here the parentheses are the sign of matrix. Double vertical bars
are also used for this purpose (that is the same arrays are enclosed
in double vertical bars instead of parentheses) but not the simple
vertical bars which designate a determinant (see Sec. VI.1). Like in
the theory of determinants, we consider elements, rows and columns
of matrices. But there is an important difference between a determinant
and a matrix: a determinant is equal to a certain number (see Sec. VI.1)
whereas a matrix is regarded as an independent object which is not
reduced to a simpler object (such as a number and the like). For
brevity, we can designate a matrix by a single letter, for instance,
by A, B etc., but then the letter A will nevertheless designate the
whole array of numbers. A matrix can be put down in the general
form as

$$
A = \begin{pmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \ldots & a_{mn}
\end{pmatrix}
\tag{2}
$$


It is convenient to equip the elements of a matrix with two indices
under the convention that the first index indicates the number of
the row and the second the number of the column in which the
element appears. We shall sometimes use the abridged notation
$A = (a_{ij})_{mn}$ which means that i varies from 1 to m and j from 1 to n.
Every matrix is characterized by its numbers of rows and columns.
A matrix having m rows and n columns will be referred to as an
(m × n) matrix. For instance, in formulas (1) and (2) we have,
respectively, (2 × 3), (3 × 3), (4 × 1), (1 × 1), and (m × n)
matrices. If the number of rows coincides with the number of columns
the matrix is said to be a square matrix. In this case the
number of its rows and columns is called the order of the matrix.
A square matrix of the first order is identified with its single element.
For instance, the fourth matrix in (1) is simply the number 5.

A matrix consisting of a single column is called a column matrix
or a number vector (column-vector). Such a matrix is identified
with a vector belonging to a Cartesian space of number n-tuples
(see Sec. VII.18). Thus, the third matrix in (1) is a vector of the
space $E_4$, the coordinates of the vector being 1, −2, 0, 3. A matrix
having only one row is called a row matrix.

A matrix all of whose elements are equal to zero is called a zero
matrix. A square matrix all of whose elements are equal to zero,
possibly except those forming its principal diagonal (that is the
diagonal connecting the left uppermost element with the right
lowermost element), is called a diagonal matrix. If the diagonal is formed
by the elements a, b, ..., k the diagonal matrix is denoted as
diag (a, b, ..., k). If all the elements of a diagonal matrix forming
its principal diagonal are equal to unity the matrix is referred to
as a unit matrix. Such a matrix is usually designated by the letter I.
For example, the unit matrix of the third order is of the form

$$
I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \operatorname{diag}(1, 1, 1)
\tag{3}
$$


The so-called operation of transposition consists in interchanging
rows and columns of a matrix with the same indices. Such an operation
was applied to determinants in Sec. VI.2. If we have a matrix
A then the transposed matrix (the transpose of A) will be designated
as A*. For instance,

$$\begin{pmatrix} 1 \\ 0 \\ -2 \end{pmatrix}^{*} = (1 \quad 0 \quad -2)$$

[a second example, the transpose of a rectangular matrix, is illegible
in the source]

In the general case we can write $a^*_{ij} = a_{ji}$ (why?). Obviously, we
always have (A*)* = A.

A matrix coinciding with its transpose is called symmetric. Of
course, only a square matrix can be symmetric. The symmetry
condition can be put down in the form $a_{ij} = a_{ji}$.

If we have $a_{ij} = -a_{ji}$ for all the elements of a matrix then the
matrix is called skew-symmetric (antisymmetric).

A square matrix A has its determinant which we shall denote by
det A. [worked example illegible in the source: a second-order
determinant equal to −3] Only square matrices have determinants;
it is impossible to speak about the determinant of a rectangular
matrix which is not square. It follows from Sec. VI.2 that

$$\det I = 1 \quad\text{and}\quad \det A^* = \det A$$


2. Operations on Matrices. Two matrices with the same numbers 
of rows and columns are added together according to the following 
rule: 

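$$
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
+ \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{pmatrix}
= \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \end{pmatrix}
$$

The multiplication of a matrix by a number is defined in an analogous
way:

$$
k \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
= \begin{pmatrix} ka_{11} & ka_{12} & ka_{13} \\ ka_{21} & ka_{22} & ka_{23} \end{pmatrix}
$$

We can easily verify that all the axioms of linear operations (see
Sec. VII.17) hold for the above operations. Hence, the set of all
matrices of the same size is a linear space. Let us put down the
following obvious formulas:

$$(A + B)^* = A^* + B^*, \quad (kA)^* = kA^* \quad\text{and}\quad \det(kC) = k^n \det C$$

where n is the order of the square matrix C. By the way, in the
general case we have det (A + B) ≠ det A + det B.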
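
The last two statements are easy to try out on second-order matrices. The helpers `det2`, `scale` and `add` below are hypothetical names introduced only for this sketch, and the sample matrices are arbitrary.

```python
def det2(m):
    """Determinant of a second-order matrix given as two rows."""
    (a, b), (c, d) = m
    return a * d - b * c

def scale(k, m):
    """Product of the number k and the matrix m."""
    return [[k * v for v in row] for row in m]

def add(a, b):
    """Sum of two matrices of the same size."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

C = [[1, 2], [3, 4]]
I = [[1, 0], [0, 1]]
print(det2(scale(3, C)), 3 ** 2 * det2(C))     # det(kC) = k^n det C with n = 2
print(det2(add(I, I)), det2(I) + det2(I))      # det(A + B) differs from det A + det B
```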

Now we are going to introduce the rule of multiplication of a
matrix by another matrix. The advisability of this peculiar rule
will be clarified in Sec. 6. First of all, for two given matrices to be
multiplied by each other, it is necessary that the number of columns
of the first matrix factor be equal to the number of the rows of the
second factor; if otherwise, the multiplication is impossible. This
condition being fulfilled, the product is found according to the rule

$$
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{pmatrix}
= \begin{pmatrix}
a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} & a_{11}b_{13} + a_{12}b_{23} \\
a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} & a_{21}b_{13} + a_{22}b_{23}
\end{pmatrix}
$$

The reader should pay much attention to the structure of the formula.
For instance, to obtain the element of the product belonging to
the first row and to the third column we must take the first row of
the first factor and the third column of the second factor and then
multiply them as if we computed the scalar product of the corresponding
number vectors [see formula (VII.12)]. Other elements of
the matrix which is the product of the two given matrices are also
obtained by means of a similar operation resembling scalar multiplication
of the rows of the first matrix by the columns of the second
matrix. In the general case when we multiply an (m × n) matrix
$(a_{ij})$ by an (n × p) matrix $(b_{ij})$ we obtain an (m × p) matrix $(c_{ij})$
whose elements are found according to the formula

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$

The above rule implies that we can always mutually multiply
two square matrices of the nth order which results in a square matrix
of the same order. In particular, we can always multiply a square
matrix by itself, that is we can raise it to the second power, but
this cannot be done with a non-square rectangular matrix. There
is another important particular case when we multiply a row matrix
by a column matrix under the condition that they contain the same
number of elements; this yields a square matrix of the first order,
that is a number:

$$
(a_1 \quad a_2 \quad a_3) \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = a_1b_1 + a_2b_2 + a_3b_3
$$

By analogy with Sec. VI.2, we can verify the following properties
of the product of matrices:

$$(kA)B = A(kB) = k(AB), \quad (A + B)C = AC + BC,$$
$$C(A + B) = CA + CB, \quad A(BC) = (AB)C$$

Of course, in all these formulas we suppose that the numbers of
rows and columns of matrices entering into these expressions guarantee
the possibility of the corresponding multiplications. Another
method of deducing the formulas will be given in Sec. 6.

The simplest examples indicate that, generally speaking, the multiplication
of matrices is non-commutative, i.e. AB ≠ BA. Let
the reader verify the following relations:

[numerical examples of second-order matrices for which AB ≠ BA;
the elements are illegible in the source]

Besides, a product of two rectangular matrices may be defined for one
order of the factors whereas the expression with the factors
interchanged makes no sense. The non-commutativity of
matrix multiplication makes it necessary to keep the order of
factors. Therefore, to specify the order, we say “to multiply A on
the right by B” or simply “to multiply A by B” (the operation results
in AB) but when speaking about the product BA we say “to multiply
A on the left by B”.
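
A concrete pair of matrices makes the non-commutativity visible. The matrices A and B below are chosen for this sketch and are not claimed to be the ones printed in the book; `matmul` is a hypothetical helper implementing the row-by-column rule.

```python
def matmul(a, b):
    """Row-by-column product of two matrices stored as lists of rows."""
    n = len(b)
    if any(len(row) != n for row in a):
        raise ValueError("columns of the first factor must equal rows of the second")
    return [[sum(row[k] * b[k][j] for k in range(n)) for j in range(len(b[0]))]
            for row in a]

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
AB = matmul(A, B)      # A multiplied on the right by B
BA = matmul(B, A)      # A multiplied on the left by B
print(AB, BA, AB != BA)
```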

We also indicate the property 


$$(AB)^* = B^*A^* \tag{4}$$

which can be easily verified, and the property

$$\det(AB) = \det A \cdot \det B \tag{5}$$


which will be proved in Sec. 7. 

If A is a complex number matrix the symbol A* designates the
result of the operation of transposition with the simultaneous repla- 
cement of all the elements by their complex conjugates. In this 
case A* is said to be the transposed conjugate matrix. The above 
formulas will hold for complex matrices if we change two of the 
formulas, namely if we write det A* = (det A)* and (kA)* = k*A*. 

3. Inverse Matrix. Here we shall consider square matrices. For 
definiteness, let us take matrices of the third order. The role of 
unit matrix (3) in the operation of multiplying matrices is analo- 
gous to the role of the number 1 in the operation of multiplying 
numbers. Indeed, we can easily verify that AI = IA = A for any 
matrix A. 

By analogy with number multiplication, we define the notion
of a matrix $A^{-1}$ which is the inverse of the matrix A: by definition,
we put

$$A^{-1}A = AA^{-1} = I \tag{6}$$

From (6) and from equality (5) it follows that

$$\det A \cdot \det(A^{-1}) = \det I = 1, \quad\text{that is}\quad \det(A^{-1}) = \frac{1}{\det A}$$

We see that for the inverse matrix to exist it is necessary that det A
be unequal to zero: det A ≠ 0. A square matrix A for which det A = 0
is called degenerate (singular). Consequently, a degenerate
matrix has no inverse. At the same time, every non-degenerate
(non-singular) matrix has its inverse. Actually, let us take an
arbitrary non-degenerate matrix

$$
K = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}
\tag{7}
$$

Then, bearing in mind the definition of the product of matrices,
we verify, reasoning as in Sec. VI.4, that the multiplication of K
on the left or on the right by the matrix

$$
\frac{1}{\det K} \begin{pmatrix} A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \\ C_1 & C_2 & C_3 \end{pmatrix}
\tag{8}
$$

yields the matrix I (the capital letters $A_1, \ldots, C_3$ designate the
corresponding cofactors of the elements of the determinant of the
matrix K; see Sec. VI.3). Matrix (8) is therefore nothing but the
matrix $K^{-1}$.
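
Formula (8) can be sketched for a third-order matrix with integer elements. The code below builds the cofactors, divides the transposed array of cofactors by the determinant and returns exact fractions. The names `minor_det` and `inverse3` and the sample matrix K are assumptions made for the illustration.

```python
from fractions import Fraction

def minor_det(m, i, j):
    """Second-order determinant left after deleting row i and column j of m."""
    rows = [r for k, r in enumerate(m) if k != i]
    (p, q), (r, s) = [[v for l, v in enumerate(row) if l != j] for row in rows]
    return p * s - q * r

def inverse3(m):
    """Inverse of a 3x3 integer matrix: transposed cofactors divided by det."""
    cof = [[(-1) ** (i + j) * minor_det(m, i, j) for j in range(3)]
           for i in range(3)]
    det = sum(m[0][j] * cof[0][j] for j in range(3))   # expansion along the first row
    if det == 0:
        raise ValueError("a degenerate matrix has no inverse")
    # element (i, j) of the inverse is the cofactor of element (j, i) over det
    return [[Fraction(cof[j][i], det) for j in range(3)] for i in range(3)]

K = [[2, 0, 1], [1, 1, 0], [0, 1, 1]]
for row in inverse3(K):
    print(row)
```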

Inverse matrices can be applied to solving matrix equations.
For instance, let us consider the equation AX = B where A and
B are given matrices and X is an unknown matrix. Let us suppose
that det A ≠ 0. Then multiplying both sides on the left by $A^{-1}$
and taking advantage of equalities (6) we deduce $X = A^{-1}B$. Similarly,
the solution of the equation XA = B is $X = BA^{-1}$ provided
det A ≠ 0.

Matrices enable us to put down a system of equations of the first
degree in an abridged form of a matrix equation. For example,
system of equations (VI.5) can be rewritten in the matrix form

$$
\begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix}
$$

(check it up!). If we designate the coefficient matrix by the letter A,
the column of unknowns (which is a number vector) by x and the
column of the constant terms by d the system is put down in a still
more abridged form as

$$A\mathbf{x} = \mathbf{d} \tag{9}$$

If det A ≠ 0 then (9) implies that the solution is

$$\mathbf{x} = A^{-1}\mathbf{d} \tag{10}$$

If we write the formula in full we shall again obtain Cramer’s rule
deduced in Sec. VI.4. It should be noted that formula (9) also makes
sense when the number of equations differs from the number of
unknowns but in such a case the matrix A will not be square; for such
a system formula (10) no longer holds because only a square matrix
has its determinant.

Equation (9) with a concrete square matrix A and a column d
composed of letters can be solved according to Gauss’ method (see
Sec. VI.5). After such a solution we come to formula (10), that is
we obtain the matrix $A^{-1}$ as the matrix of the coefficients in the
coordinates of the vector d. This method of constructing an inverse
matrix is practically more convenient than the application of
formula (8) especially when the order of the matrix is large.
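
One common way to organise this computation is to carry the unit matrix alongside A during the elimination, so that each column of the unit matrix plays the role of one of the letters of the column d. The sketch below does this with exact fractions; the Gauss-Jordan arrangement and the name `inverse_by_gauss` are assumptions of this illustration, not the layout of Sec. VI.5.

```python
from fractions import Fraction

def inverse_by_gauss(a):
    """Inverse of a square matrix (list of rows) by elimination on the array [A | I]."""
    n = len(a)
    rows = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
            for i, row in enumerate(a)]
    for col in range(n):
        # find a row with a non-zero element in the current column
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("a degenerate matrix has no inverse")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        p = rows[col][col]
        rows[col] = [v / p for v in rows[col]]            # make the pivot equal to 1
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [v - f * w for v, w in zip(rows[r], rows[col])]
    return [row[n:] for row in rows]                       # the right half is the inverse

A = [[2, 1], [5, 3]]
A_inv = inverse_by_gauss(A)
d = [1, 2]
x = [sum(A_inv[i][j] * d[j] for j in range(2)) for i in range(2)]   # x = A^{-1} d
print(A_inv, x)
```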

Formula (6) indicates that the matrices A and $A^{-1}$ are mutually
inverse, that is $(A^{-1})^{-1} = A$. Besides, we sometimes apply the formula
$(AB)^{-1} = B^{-1}A^{-1}$ (where det A ≠ 0 and det B ≠ 0) which
can be readily verified:

$$(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I$$

Finally, substituting $B = A^{-1}$ into formula (4) we deduce

$$(A^{-1})^*A^* = (AA^{-1})^* = I^* = I, \quad\text{that is}\quad (A^{-1})^* = (A^*)^{-1}$$

4. Eigenvectors and Eigenvalues of a Matrix. Let A be a given
square matrix. As we shall see later, we sometimes encounter an
equation of the form

$$A\mathbf{x} = \lambda\mathbf{x} \tag{11}$$

where x is an unknown number vector and λ is an unknown number,
the dimension of x being equal to the order of A. Equation (11)
has the trivial solution x = 0 for any λ but we shall be interested
only in those λ for which the system has non-trivial solutions. A
number λ of this kind is called an eigenvalue of the matrix A and
the corresponding solution x of equation (11) is called an eigenvector
of the matrix A.

Eigenvalues and eigenvectors can be found as follows. Since 
x = Ix we can rewrite equation (11) in the form 

$$(A - \lambda I)x = 0 \tag{12}$$
Comparing formula (12) with formula (9) we see that we have arrived 
at a system of n algebraic homogeneous linear equations in n un- 
knowns where n is the order of the matrix A. According to Sec. VI.6, 
for a non-trivial solution to exist, it is necessary and sufficient 
that the determinant of the system be equal to zero, i.e. 


$$\det (A - \lambda I) = 0 \tag{13}$$

This equation is called the characteristic equation of the matrix
A and it enables us to find the eigenvalues $\lambda$. For instance, in the
case of matrix (7) the equation has the form
$$\begin{vmatrix} a_1 - \lambda & b_1 & c_1 \\ a_2 & b_2 - \lambda & c_2 \\ a_3 & b_3 & c_3 - \lambda \end{vmatrix} = 0$$


Writing the determinant in full we see that this is an algebraic
equation whose degree is equal to the order of the matrix A. By
Sec. VIII.8, we conclude that a matrix of order n has n eigenvalues.
Of course, some of them may be complex and some may coincide.

If we consider only real numbers and real vectors then equation 
(11) is satisfied only in the case when we take a real root of the cha- 
racteristic equation (provided there are such). But if we admit com- 
plex numbers then every root of the characteristic equation can be 
substituted into (11). 
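
For a matrix of the second order the characteristic equation (13) is
the quadratic $\lambda^2 - (a + d)\lambda + (ad - bc) = 0$, so its roots can be written
down directly. The following is a minimal sketch; the function name
`eigenvalues_2x2` and the two sample matrices are our own illustrative
choices, not the book's.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Roots of det(A - lambda*I) = 0 for the matrix A = [[a, b], [c, d]]."""
    trace = a + d
    det = a * d - b * c
    # cmath.sqrt also handles a negative discriminant (complex roots).
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

# A diagonal matrix: both roots of the characteristic equation are real.
print(eigenvalues_2x2(2, 0, 0, 3))
# A rotation through 90 degrees: the roots are purely imaginary,
# so equation (11) has no non-trivial real solutions.
print(eigenvalues_2x2(0, -1, 1, 0))
```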

After an eigenvalue has been found we can determine the corres- 
ponding eigenvector by solving vector equation (12). For this purpose 
we rewrite the equation in the form of a system of scalar equations 
and apply the methods of Sec. VI.6. Equation (12) implies that,
for a fixed $\lambda$, the sum $y = x^1 + x^2$ of particular solutions $x^1$ and $x^2$
is a solution of the same system and that the product z = kx of a
solution x by a number k is also a solution. Hence, the set of all
eigenvectors corresponding to a given eigenvalue is a linear
subspace (see Sec. VII.18) of the space of all number vectors of
dimension n.


The most important case here is that in which all the eigenvalues
are distinct. In this case the subspace of eigenvectors corresponding
to each eigenvalue $\lambda$ is one-dimensional, [...] eigenvalue is defined
[...] [..., as it has been mentioned, characteristic [...] coefficients
can have both real and imaginary roots]. The fact that such [...]
space cannot have more than n linearly [...] linear independence can
be proved as [...] correspond to different eigenvalues are linearly
independent [...] $x^3 = \alpha x^1 + \beta x^2$ then, multiplying both sides of the
equality on the left by A, we obtain $\lambda_3x^3 = \alpha\lambda_1x^1 + \beta\lambda_2x^2$; after that,
multiplying the first equality by $\lambda_3$ and subtracting it from the
second one, we deduce $\alpha(\lambda_1 - \lambda_3)x^1 + \beta(\lambda_2 - \lambda_3)x^2 = 0$, which
contradicts the linear independence of $x^1$ and $x^2$. The case when $x^1$
and $x^2$ are linearly dependent is treated similarly.
If there are coinciding eigenvalues it can be shown that the
dimension $m_k$ of the linear subspace of eigenvectors corresponding to
an eigenvalue $\lambda_k$ of multiplicity $n_k$ satisfies the inequality $m_k \leq n_k$.
If we have $m_k = n_k$ for all the eigenvalues we can choose a basis
in each of the subspaces and form a basis (by combining all the
bases) in the complex n-dimensional Cartesian space $Z_n$ consisting
of eigenvectors of the matrix A of order n. In case all $\lambda_k$
are real we thus obtain a basis in $E_n$. But if $m_k < n_k$ at least for
one eigenvalue it is impossible to construct a basis consisting of
eigenvectors of the matrix A.

5. The Rank of a Matrix. Let us delete several rows and a number 
of columns in an arbitrary matrix A so that the numbers of the 
remaining rows and columns should coincide. Then forming the 
determinant of the remaining square matrix we obtain a so-called 
minor of the matrix A. A matrix can have many minors; some of 
them may equal zero and some may be different from zero. The 
maximal order of minors which are unequal to zero is called the 
rank of the matrix A. This is a very important characteristic of 
a matrix. For example, all the three minors of the second order
[...] of the matrix B = [...]
are equal to zero whereas there are four minors unequal to zero
among the six minors of the first order of the matrix. (By definition, 
a determinant of the first order is understood as being equal to its 
single element.) Therefore, rank B = 1. Let the reader verify that 
the ranks of the matrices 


[five matrices, illegible in the source]                    (14)


are, respectively, 2, 3, 2, 1 and 1. The rank of a zero matrix which 
has no minors different from zero is assumed to be equal to zero. 

Evidently, the rank of a square matrix does not exceed its order
and is equal to the order if and only if the matrix is
non-degenerate. The rank of an (m × n) matrix with $m \neq n$ does not
exceed the least of the numbers m and n.

It can be shown that the rank of a matrix is equal to the maximal 
possible number of linearly independent rows in the matrix. We are 
not going to prove this property here. (By the way, the rows of 
a matrix can be regarded as matrices and we can therefore perform 
linear operations on them.) For instance, in the second example (14) 
all the three rows are linearly independent; in the third example
the first two rows are linearly independent whereas the third equals
their sum; in the fourth example the second and the third rows are
linearly expressed in terms of the first one.

Property 7 in Sec. VI.2 immediately implies that the rank of the 
transposed matrix coincides with that of the original matrix. The rank 
of a matrix is therefore simultaneously equal to the maximal pos- 
sible number of linearly independent columns in the matrix. In 
concrete problems the rank of a matrix can be found by means of 
transformations similar to those described in Sec. VI.3. Let the 
reader think about the order in which the necessary operations should 
be performed. 
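
One possible ordering of these operations is ordinary row elimination:
the number of non-zero rows left at the end is the rank. The sketch
below is an illustration under our own assumptions (the function name
`rank` and the sample matrices are ours and are not the matrices (14)
of the text).

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a rectangular matrix found by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column, move on to the next one
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [v - factor * w for v, w in zip(rows[i], rows[r])]
        r += 1
    return r

print(rank([[1, 2, 3], [2, 4, 6]]))             # second row is twice the first
print(rank([[1, 0, 2], [0, 1, 1], [1, 1, 3]]))  # third row is the sum of the first two
print(rank([[0, 0], [0, 0]]))                   # zero matrix
```

Transposing the input leaves the result unchanged, in agreement with
the statement about the rank of the transposed matrix.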

The concept of the rank of a matrix makes it possible to state 
the theorems on the solvability of a system of algebraic linear 
equations in the general case when the number of equations may 
not coincide with the number of unknowns. For definiteness, let 
us take a system of three equations in four unknowns of the form 


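$$\begin{aligned}
a_1x + b_1y + c_1z + d_1u &= f_1 \\
a_2x + b_2y + c_2z + d_2u &= f_2 \\
a_3x + b_3y + c_3z + d_3u &= f_3
\end{aligned} \tag{15}$$

Introducing the number vectors

$$a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \quad \ldots, \quad f = \begin{pmatrix} f_1 \\ f_2 \\ f_3 \end{pmatrix}$$

we can rewrite the system in the form
$$f = xa + yb + zc + ud \tag{16}$$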


Hence the problem is reduced to resolving a given vector f with
respect to given vectors a, b, c, d. What are the conditions
guaranteeing the possibility of such a resolution? For given a, b, c, d,
all the vectors of the form xa + yb + zc + ud with all the possible
values of x, y, z, u constitute a linear subspace in $E_3$ “spanned”
by a, b, c, d. The dimension of the subspace, by the lemma in
Sec. VII.19, equals the maximal number of linearly independent
vectors among a, b, c, d, that is it equals the rank of the coefficient
matrix A of system (15). For resolution (16) to be possible, it is
necessary that the vector f should belong to the subspace. Hence,
the maximal number of linearly independent vectors among the vectors
a, b, c, d, f must be the same as among a, b, c, d. This yields the
necessary and sufficient condition for the existence of a solution
of system (15):

$$\operatorname{rank} \begin{pmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \end{pmatrix} = \operatorname{rank} \begin{pmatrix} a_1 & b_1 & c_1 & d_1 & f_1 \\ a_2 & b_2 & c_2 & d_2 & f_2 \\ a_3 & b_3 & c_3 & d_3 & f_3 \end{pmatrix} \tag{17}$$

that is the ranks of the coefficient matrix and of the augmented
matrix must coincide. The condition guaranteeing the solvability
of a linear system of arbitrary number of equations containing any
number of unknown quantities is of the same form.

Now let us suppose that condition (17) for the solvability of
system (15) is fulfilled. Then what is the number of solutions of
system (15)? Let us designate by $x_0, y_0, z_0, u_0$ some concrete
particular solution of the system and let us introduce new variables
$x', y', z', u'$ by means of the relations $x = x_0 + x'$, $y = y_0 + y'$,
$z = z_0 + z'$, $u = u_0 + u'$. Then we readily verify that $x', y', z', u'$
satisfy the homogeneous linear system

$$\begin{aligned}
a_1x' + b_1y' + c_1z' + d_1u' &= 0 \\
a_2x' + b_2y' + c_2z' + d_2u' &= 0 \\
a_3x' + b_3y' + c_3z' + d_3u' &= 0
\end{aligned} \tag{18}$$

Introduce the following four vectors of $E_4$:

$$p_1 = \begin{pmatrix} a_1 \\ b_1 \\ c_1 \\ d_1 \end{pmatrix}, \quad p_2 = \begin{pmatrix} a_2 \\ b_2 \\ c_2 \\ d_2 \end{pmatrix}, \quad p_3 = \begin{pmatrix} a_3 \\ b_3 \\ c_3 \\ d_3 \end{pmatrix} \quad \text{and} \quad x' = \begin{pmatrix} x' \\ y' \\ z' \\ u' \end{pmatrix}$$

Then, by Secs. VII.20-21, system (18) can be rewritten in the form
$$p_1 \cdot x' = 0, \quad p_2 \cdot x' = 0, \quad p_3 \cdot x' = 0 \tag{19}$$


Thus, we see that the sought-for vector x' must be perpendicular
to the subspace of $E_4$ “spanned” by the vectors $p_1, p_2, p_3$. The
dimension of the subspace being equal to the rank (17), we can readily
show that the dimension of the linear subspace constituted by the
vectors x' is equal to 4 − rank A (in the general case 4 is replaced by
the corresponding number of unknowns). Therefore, the dimension of
the set of the solutions of system (19) is the same. Each of the
solutions of (15) can be regarded as a point of $E_4$, and thus the
fulfilment of condition (17) implies that the set of the solutions of
system (15) is a hyperplane of dimension 4 − rank A in the space $E_4$
(see Sec. VII.19).
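
The solvability condition (17) and the dimension count 4 − rank A can
be checked mechanically. The following sketch is only an illustration:
the helper names `rank` and `solution_set_dimension` and the sample
system of three equations in four unknowns are assumptions of ours,
not part of the text.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [v - factor * w for v, w in zip(rows[i], rows[r])]
        r += 1
    return r

def solution_set_dimension(A, f):
    """Return (number of unknowns - rank A) if condition (17) holds, else None."""
    augmented = [row + [fi] for row, fi in zip(A, f)]
    r = rank(A)
    if rank(augmented) != r:
        return None  # the ranks differ: the system has no solution
    return len(A[0]) - r

# Hypothetical system: the third equation is the sum of the first two.
A = [[1, 1, 0, 0],
     [0, 0, 1, 1],
     [1, 1, 1, 1]]
print(solution_set_dimension(A, [1, 2, 3]))  # compatible right-hand side
print(solution_set_dimension(A, [1, 2, 4]))  # incompatible right-hand side
```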


§ 2. Linear Mappings 
6. Linear Mapping and Its Matrix. Let us begin with an example.
Let a plane be turned through an angle $\alpha$. Then every vector $\vec{x}$
belonging to the plane will be carried into another vector $\vec{y}$ which
we denote as $A(\vec{x})$ or, simply, as $A\vec{x}$. Hence, $\vec{y} = A\vec{x}$. (For our
further purposes in § 2, it is convenient to designate geometric
vectors and some other vectors by letters in ordinary type equipped
with arrows.) In this case A is therefore the sign indicating the
rotation of a vector; to each vector $\vec{x}$ there corresponds the vector
$A\vec{x}$. In other words, we have defined a mapping A of the plane of
vectors into itself. The terms a “transformation A” or an “operator A”
are used synonymously in such a case. A given vector $\vec{x}$ is called a
preimage (an original, or an inverse image) and the vector $A\vec{x}$ is
called the image of $\vec{x}$ under the mapping A.
A rotation transforming parallelograms into parallelograms, we
conclude that the addition of preimages yields the addition of
the corresponding images (see Fig. 222). In other words, in
the case of rotation the image of a sum is the sum of the images:
$A(\vec{x}_1 + \vec{x}_2) = A\vec{x}_1 + A\vec{x}_2$. We can similarly verify that the
multiplication of a preimage by a number yields the multiplication
of the image by the same number, i.e. $A(\lambda\vec{x}) = \lambda A\vec{x}$. Hence,
under the mapping in question, to linear operations on the preimages
there correspond analogous linear operations on images, that is
linear relations between vectors remain valid after the mapping has
been performed. For instance, if $\vec{x}_3 = 2\vec{x}_1 - 5\vec{x}_2$ then
$A\vec{x}_3 = 2A\vec{x}_1 - 5A\vec{x}_2$, etc. This property is called the linearity of
the mapping.

[Fig. 222: $A(\vec{x}_1 + \vec{x}_2) = A\vec{x}_1 + A\vec{x}_2$]
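
The linearity of a rotation can also be checked numerically. The sketch
below assumes a Cartesian coordinate system and uses the standard
rotation formulas; the helper names `rotate` and `combine`, the angle
and the two sample vectors are illustrative choices of ours.

```python
import math

def rotate(v, alpha):
    """Rotate the plane vector v = (x, y) through the angle alpha (radians)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def combine(a, u, b, v):
    """Linear combination a*u + b*v of two plane vectors."""
    return (a * u[0] + b * v[0], a * u[1] + b * v[1])

alpha = math.radians(30)
x1, x2 = (1.0, 2.0), (-3.0, 0.5)

# Image of the combination 2*x1 - 5*x2 ...
lhs = rotate(combine(2, x1, -5, x2), alpha)
# ... against the same combination of the images.
rhs = combine(2, rotate(x1, alpha), -5, rotate(x2, alpha))

print(lhs, rhs)  # the two pairs agree up to rounding
```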

The operation of projecting all the vectors in space on a fixed 
plane or on a straight line possesses the same properties. The verifi- 
cation of the properties is left to the reader. We can take quite 
a different example now. Let us consider the space of all polynomials 
(see Sec. VII.18) and define the image of each of the polynomials as 
its derivative. The linearity of this mapping is implied by the facts
that the derivative of a sum is equal to the sum of the derivatives
and that a constant factor can be taken outside the sign of
differentiation.

We now proceed to give the general definition of a linear mapping.
Let two linear spaces (R) and (S) (see Sec. VII.17) be given. Suppose
that there is a law, a rule, according to which to every vector
$\vec{x} \in (R)$ there corresponds a certain vector $\vec{y} = A\vec{x} \in (S)$. Then we
say that we are given a mapping A of the space (R) into the space (S).
[If (S) = (R) then we say that there is a mapping of the space (R)
into itself. If every vector $\vec{y} \in (S)$ is the image of a vector
$\vec{x} \in (R)$ under a mapping A then we say that A maps (R) onto (S).]
A mapping is said to be linear if for any $\vec{x}_1, \vec{x}_2 \in (R)$ and for any
number $\lambda$ we have

$$A(\vec{x}_1 + \vec{x}_2) = A\vec{x}_1 + A\vec{x}_2 \quad \text{and} \quad A(\lambda\vec{x}) = \lambda A\vec{x} \tag{20}$$
Applying these properties several times we readily deduce

$$A(\lambda_1\vec{x}_1 + \lambda_2\vec{x}_2 + \ldots + \lambda_k\vec{x}_k) = \lambda_1A\vec{x}_1 + \lambda_2A\vec{x}_2 + \ldots + \lambda_kA\vec{x}_k \tag{21}$$


Hence, a linear transformation does not change the form of a linear 
combination because the coefficients remain the same, and it is 
only preimages that are replaced by the corresponding images here. 
Hence, not only the sum of preimages goes into the sum of the 
images but also the difference goes into the difference and so on. 
Putting $\lambda = 0$ in the second equality (20) we deduce the relation
$A\vec{0} = \vec{0}$ which holds for all the linear mappings. Here we have,
of course, the zero vector of the space (S) on the right-hand side 
and the zero vector of the space (R) under the sign of the linear 
mapping A on the left-hand side. 

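For definiteness, let us suppose that the space (R) is
three-dimensional and the space (S) is two-dimensional. Let us
arbitrarily choose a basis $\vec{p}_1, \vec{p}_2, \vec{p}_3$ in (R) and a basis $\vec{q}_1, \vec{q}_2$ in (S).
Each of the vectors $A\vec{p}_i$ belongs to (S) and it can therefore be
resolved with respect to the basis $\vec{q}_1, \vec{q}_2$. Let us introduce the
notation

$$\begin{aligned}
A\vec{p}_1 &= a_{11}\vec{q}_1 + a_{21}\vec{q}_2 \\
A\vec{p}_2 &= a_{12}\vec{q}_1 + a_{22}\vec{q}_2 \\
A\vec{p}_3 &= a_{13}\vec{q}_1 + a_{23}\vec{q}_2
\end{aligned} \tag{22}$$

Then, by formula (21), any vector $\vec{x} = x_1\vec{p}_1 + x_2\vec{p}_2 + x_3\vec{p}_3 \in (R)$
goes into a vector $\vec{y} = A\vec{x} = y_1\vec{q}_1 + y_2\vec{q}_2 \in (S)$ according to the
following rule:

$$\begin{aligned}
\vec{y} &= A(x_1\vec{p}_1 + x_2\vec{p}_2 + x_3\vec{p}_3) = x_1A\vec{p}_1 + x_2A\vec{p}_2 + x_3A\vec{p}_3 = \\
&= (a_{11}x_1 + a_{12}x_2 + a_{13}x_3)\,\vec{q}_1 + (a_{21}x_1 + a_{22}x_2 + a_{23}x_3)\,\vec{q}_2
\end{aligned}$$

that is

$$\begin{aligned}
y_1 &= a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\
y_2 &= a_{21}x_1 + a_{22}x_2 + a_{23}x_3
\end{aligned} \tag{23}$$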
Thus, we have arrived at formulas which express transformation of
the coordinates of a vector under a linear mapping. If we denote

$$x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \quad A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}$$

then, by Sec. 2, formulas (23) can be rewritten in the form
$$y = Ax \tag{24}$$
The number matrix A entering into formula (24) is called the matrix
of the linear mapping (operator) A relative to the given bases $\vec{p}_i$
and $\vec{q}_j$ since it depends not only on the mapping itself but also on
the choice of the bases. The components of the number vectors x and
y depend not only on the vectors $\vec{x}$ and $\vec{y}$ but also on the bases.

If we have two bases $\vec{p}_i$, $\vec{q}_j$ chosen in the spaces (R) and (S) and
if it is known that any vector $\vec{x} = x_1\vec{p}_1 + x_2\vec{p}_2 + x_3\vec{p}_3 \in (R)$ is
carried into a vector $\vec{y} = A\vec{x} = y_1\vec{q}_1 + y_2\vec{q}_2 \in (S)$ whose coordinates
$y_1, y_2$ are defined by formulas (23) the mapping $\vec{y} = A\vec{x}$ is linear.
When we add vectors $\vec{x}$ their similar components are also added and
the corresponding number vectors are added too; but (24) implies
that when we add number vectors x the corresponding vectors
y are also added. The second property (20) is verified similarly.


Hence, if certain bases $\vec{p}_i$ and $\vec{q}_j$ are chosen then to every linear
mapping of (R) into (S) there corresponds its matrix which is the
transpose of the coefficient matrix of the expansion of the vectors
$A\vec{p}_i$ with respect to the basis $\vec{q}_j$. Conversely, each matrix with the
corresponding numbers of rows and columns [a (2 × 3) matrix in
our case] is a matrix of a linear mapping of (R) into (S). Evidently,
if (R) is of dimension n and (S) is of dimension m then the matrix
of a linear mapping of (R) into (S) is an (m × n) matrix. In particular,
if (S) = (R) the matrix of a linear mapping is square. In such
a case we shall always expand vectors with respect to the same basis
before and after the mapping, unless the contrary is stated.

The rank of the matrix A being equal to the maximal number of
linearly independent columns, formula (22) implies that the
rank equals the number of linearly independent vectors among the
vectors $A\vec{p}_i$. Thus, the rank is equal to the dimension of the linear
subspace A(R) formed by the images of all the vectors of (R) under
the mapping A. The space A(R) can either coincide with (S) or
be a linear subspace of (S) having a lower dimension. As it has
been already mentioned, in the first case we say that (R) is mapped
onto (S). In particular, it follows that although the matrix A of
a mapping A of (R) into (S) depends on the choice of bases in (R) 
and (S) the rank of the matrix is independent of the choice. Besides, 
the rank of an (m x n) matrix not exceeding n, the dimension of 
A (R) cannot exceed that of (R). Consequently, we see that a di- 
mension cannot increase under a linear mapping (in § 4 we shall 
see that the same is true for a non-linear mapping). 
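
The rule "the matrix of a mapping is the transpose of the coefficient
matrix of the expansions (22)" can be illustrated with the
differentiation of polynomials mentioned earlier. The sketch below
makes several assumptions of our own: (R) is taken as the polynomials
of degree at most 2 with the basis 1, t, t², (S) as the polynomials of
degree at most 1 with the basis 1, t, and the name `derivative` is
purely illustrative.

```python
def derivative(coeffs):
    """Map the coordinates (c0, c1, c2) of c0 + c1*t + c2*t**2
    to the coordinates (c1, 2*c2) of its derivative in the basis 1, t."""
    c0, c1, c2 = coeffs
    return [c1, 2 * c2]

# Coordinates of the base vectors 1, t, t**2 of (R).
basis_R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Rows: coordinates of the images A p_i, i.e. the coefficients in (22).
images = [derivative(p) for p in basis_R]

# Transposing gives the (2 x 3) matrix of the mapping, as in (24).
A = [list(column) for column in zip(*images)]
print(A)

# Check y = Ax on the polynomial 5 + 3t + 4t**2.
x = [5, 3, 4]
y = [sum(A[i][j] * x[j] for j in range(3)) for i in range(2)]
print(y, derivative(x))
```

The rank of this matrix is 2, which agrees with the fact that the
derivatives of all such polynomials fill the whole two-dimensional
space (S).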

Instead of considering a mapping of vectors into vectors we can 
also consider a mapping of points into points which is more visual. 
Let us suppose that to each point M of a plane (P) there corresponds
a certain point $\bar{M}$ of a plane $(\bar{P})$. Then we can say that we are given
a mapping of the plane (P) into the plane $(\bar{P})$. In such a case we
shall write $\bar{M} = f(M)$ (compare with the consideration in Sec. IX.9
concerning this notation). Suppose that the mapping f is such that
it does not violate rectilinearity, that is suppose that vectors lying
in the plane (P) are carried into vectors lying in the plane $(\bar{P})$ under
the mapping. In addition, let us suppose that equal vectors in the
plane (P) go into equal vectors in the plane $(\bar{P})$. Then we can say
that to each vector $\vec{x}$ of the plane (P) there corresponds a completely
specified vector $\vec{y}$ of the plane $(\bar{P})$ which is independent of the
disposition of the origin of $\vec{x}$ in (P). Let us denote $\vec{y}$ as $\vec{y} = A\vec{x}$.
Finally, let the mapping A be linear. (The example at the beginning of
Sec. 6 satisfies all the requirements enumerated here. The planes (P)
and $(\bar{P})$ coincide, and f(M) is understood as the result of rotation
of the point M through the angle $\alpha$ about the centre of rotation.)

Now let us choose an arbitrary affine coordinate system (see
Sec. VII.9) with the coordinates designated as $x_1, x_2$, with the origin
of coordinates O and the base vectors $\vec{p}_1, \vec{p}_2$. Then the radius-vector
is represented in the form $\vec{r} = x_1\vec{p}_1 + x_2\vec{p}_2$. Let us also choose in
the plane $(\bar{P})$ an arbitrary affine coordinate system $y_1, y_2$ with the
origin $\bar{O}$ and the base vectors $\vec{q}_1, \vec{q}_2$. Denote the coordinates of the
point f(O) belonging to the plane $(\bar{P})$ by $b_1, b_2$. Let the coordinates
$x_1, x_2$ of a point M of the plane (P) be given. What are the
coordinates $y_1, y_2$ of the corresponding point f(M)? We have

$$\overrightarrow{\bar{O}f(M)} = \overrightarrow{\bar{O}f(O)} + \overrightarrow{f(O)f(M)} = b_1\vec{q}_1 + b_2\vec{q}_2 + A(\overrightarrow{OM})$$

and therefore, by above formulas for transformation of the
coordinates of a vector, we obtain

$$\begin{aligned}
y_1 &= a_{11}x_1 + a_{12}x_2 + b_1 \\
y_2 &= a_{21}x_1 + a_{22}x_2 + b_2
\end{aligned} \tag{25}$$
or, in the matrix notation,

$$y = Ax + b \tag{26}$$

where $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ is the matrix of the mapping A relative to
the chosen bases and $b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$. Conversely, we can easily verify
that if the coordinates of points are transformed according to
formulas (25) the corresponding mapping possesses the properties
described in the preceding paragraph. The simplest formulas are
obtained when the origin of coordinates in the plane (P) is carried
into the origin of coordinates in the plane $(\bar{P})$ under the mapping
in question. We have $b_1 = b_2 = 0$ in such a case, and therefore
formulas (25) turn into

$$\begin{aligned}
y_1 &= a_{11}x_1 + a_{12}x_2 \\
y_2 &= a_{21}x_1 + a_{22}x_2
\end{aligned} \tag{27}$$
i.e. y = Ax. The coordinates of vectors are also transformed
according to formulas (27) in the general case (25).

If we consider the geometric space or an abstract Cartesian space
of any dimension (see Sec. VII.18) we arrive at formulas similar
to (25) but with different numbers of rows and columns. The matrix
form of writing will have form (26) again but in the general case the
rectangular matrix A may not be square since we can have a mapping
of one space into another when their dimensions are unequal.

Let us suppose now that the dimensions are equal. For simplicity’s
sake, let us again consider a mapping of a plane (P) into a plane $(\bar{P})$.
If $\det A \neq 0$ the mapping is called affine. In this case we can
multiply equality (26) on the left by $A^{-1}$ which results in
$x = A^{-1}y - A^{-1}b$. Consequently, we obtain an equality of the same
form (26). The inverse mapping of the plane $(\bar{P})$ into the plane (P) is
therefore also affine.

In Fig. 223 we illustrate the most important types of affine
mapping of a plane onto itself for which the origin of coordinates
remains at the same place. Formulas for the transformations of
coordinates and the corresponding matrices relative to a Cartesian
coordinate system are also put down in Fig. 223. (Let the reader
prove the formulas in the third example taking advantage of the
formulas $y_1 = \rho \cos (\varphi + \alpha)$ and $y_2 = \rho \sin (\varphi + \alpha)$ where
$\rho = OM = O\bar{M}$.) Of course, we can also consider different
combinations of these simple mappings and additional parallel
translations as well.

If det A = 0 then the rank of the matrix is equal either to 1 or
to 0. As it was shown above, in the first case the plane (P) is mapped
onto a straight line (in particular, we have such a case when we
consider the operation of projecting), and in the second case the
plane goes into a point of the plane $(\bar{P})$.
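
The statement that the inverse of an affine mapping has the same form
(26) can be illustrated as follows. This is a minimal sketch for a
plane with integer coefficients; the names `affine` and
`inverse_affine` and the sample numbers are assumptions of ours.

```python
from fractions import Fraction

def affine(A, b, x):
    """Apply y = A x + b to a point of the plane, as in formula (26)."""
    return [A[0][0] * x[0] + A[0][1] * x[1] + b[0],
            A[1][0] * x[0] + A[1][1] * x[1] + b[1]]

def inverse_affine(A, b):
    """Return (A_inv, c) such that x = A_inv y + c, where c = -A_inv b."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("det A = 0: the mapping is not affine")
    A_inv = [[Fraction(A[1][1], det), Fraction(-A[0][1], det)],
             [Fraction(-A[1][0], det), Fraction(A[0][0], det)]]
    c = [-(A_inv[0][0] * b[0] + A_inv[0][1] * b[1]),
         -(A_inv[1][0] * b[0] + A_inv[1][1] * b[1])]
    return A_inv, c

A = [[2, 1], [1, 1]]   # illustrative matrix, det A = 1
b = [3, -1]
x = [4, 5]

y = affine(A, b, x)
A_inv, c = inverse_affine(A, b)
x_back = affine(A_inv, c, y)   # applying the inverse mapping returns x
print(y, x_back)
```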
Fig. 223. Affine mappings of a plane:
(a) k-fold stretching along the $x_1$-axis: $y_1 = kx_1$, $y_2 = x_2$; matrix $\begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}$
(b) k-fold stretching in all directions: $y_1 = kx_1$, $y_2 = kx_2$; matrix $\begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix}$
(c) Rotation through the angle $\alpha$: $y_1 = x_1 \cos \alpha - x_2 \sin \alpha$, $y_2 = x_1 \sin \alpha + x_2 \cos \alpha$; matrix $\begin{pmatrix} \cos \alpha & -\sin \alpha \\ \sin \alpha & \cos \alpha \end{pmatrix}$
(d) Shear along the $x_1$-axis: $y_1 = x_1 + \lambda x_2$, $y_2 = x_2$; matrix $\begin{pmatrix} 1 & \lambda \\ 0 & 1 \end{pmatrix}$
(e) Reflection in the $x_1$-axis: $y_1 = x_1$, $y_2 = -x_2$; matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$

By analogy with Sec. II.7, we can readily prove that the order
of an algebraic curve does not change under an affine mapping. In
particular, the straight lines being the only curves of the first order
(see Sec. II.9), an affine mapping transforms straight lines into
straight lines. Since the point of intersection of two lines must be
carried into the point of intersection of their images under the
mapping, intersecting lines go into intersecting lines, and
consequently parallel straight lines go into parallel lines. On the
basis of the definition we can verify that the ratio of two parallel
line segments does not change under an affine mapping; at the same
time, the ratio of non-parallel segments, angles and lengths are
changed in the general case. Curves of the second order are carried
into curves of the second order under an affine mapping. An ellipse
being the only finite curve of the second order, an affine mapping
transforms an ellipse into an ellipse (or into a circle in a particular
case). A parabola is an infinite curve of the second order consisting
of one component and it is therefore mapped onto a parabola. Finally,
a hyperbola is mapped onto a hyperbola.

Let us return to linear mappings of general linear spaces (see 
the beginning of this section). Let two such mappings A and B of 
a space (R) into a space (S) be given. Then we can define the sum of
the mappings by means of the formula $(A + B)\vec{x} = A\vec{x} + B\vec{x}$. We
similarly define the multiplication of a mapping by a number:
$(\lambda A)\vec{x} = \lambda(A\vec{x})$. We readily verify that if some bases in (R) and
(S) are chosen and if the above operations are performed on mappings 
then the same operations are performed on the matrices of the map- 
pings. Besides, all the axioms of linear operations hold here. The 
role of the zero mapping is played by the mapping of the whole space 
(R) into the zero vector of the space (S). 

The multiplication of mappings is defined as their successive 
performance. More exactly, suppose we have a mapping B of a space 
(R) into a space (S) and a mapping A of the space (S) into a space 
(T). Then AB is understood as a “composite” mapping of (R) into
(T) which is obtained if we first perform the mapping of (R) into
(S) and then perform the mapping of (S) into (T), that is
$(AB)\vec{x} = A(B\vec{x})$. If some bases are chosen in (R), (S) and (T) the
multiplication of the mappings yields the multiplication of their
matrices which accounts for the rule of multiplication of matrices
given in Sec. 2. The rule is extremely important and we shall
therefore illustrate what has been said by taking an example of
matrices of the second order. Let the mappings B and A be represented
by the corresponding formulas

$$\begin{aligned}
y_1 &= b_{11}x_1 + b_{12}x_2 \\
y_2 &= b_{21}x_1 + b_{22}x_2
\end{aligned}
\quad \text{and} \quad
\begin{aligned}
z_1 &= a_{11}y_1 + a_{12}y_2 \\
z_2 &= a_{21}y_1 + a_{22}y_2
\end{aligned}$$

where $x_j, y_k, z_i$ (j, k, i = 1, 2) are the coordinates in the spaces
under consideration. Then in order to obtain the “composite” mapping
we must substitute the first formulas into the second which results in

$$\begin{aligned}
z_1 &= a_{11}(b_{11}x_1 + b_{12}x_2) + a_{12}(b_{21}x_1 + b_{22}x_2) = \\
&= (a_{11}b_{11} + a_{12}b_{21})x_1 + (a_{11}b_{12} + a_{12}b_{22})x_2 = c_{11}x_1 + c_{12}x_2, \\
z_2 &= a_{21}(b_{11}x_1 + b_{12}x_2) + a_{22}(b_{21}x_1 + b_{22}x_2) = \\
&= (a_{21}b_{11} + a_{22}b_{21})x_1 + (a_{21}b_{12} + a_{22}b_{22})x_2 = c_{21}x_1 + c_{22}x_2
\end{aligned}$$

Thus, we have arrived at the matrix $C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}$ formed
according to the rule indicated in Sec. 2 which [...].

In connection with the relationship [...] mappings and the
corresponding ope[...]ity. Here we shall indicate
only the property A(BC) = (AB)C which is implied by the
corresponding property of mappings A(BC) = (AB)C, the latter being
justified by the fact that both on the left-hand side and on the
right-hand side we have the mappings which are obtained if we first
perform the mapping C and then the mappings B and A in succession.
When we consider the mappings of a space into itself, that is when
(S) = (R), the role of the unity in the operation of multiplication is
played by the identity mapping (unit mapping) under which every
vector is carried into itself: $I\vec{x} = \vec{x}$. It is clear that we always
have AI = IA = A. The matrix of the unit mapping is the unit matrix I
relative to any basis chosen in the space in question. Similarly, the 
inverse matrix corresponds to the inverse mapping. We saw that the 
multiplication of matrices is non-commutative in the general case. 
This is obviously accounted for by the fact that when we reverse 
the order in which mappings are performed the result can be changed 
considerably. (For instance, let the reader verify that if we first 
apply the mapping shown in Fig. 223a for k = 2 to the point (0, 4) 
and then apply the mapping c for a = 90° this will result in the 
point (2, 0). But if we reverse the order of the operations we shall 
obtain the point (1, 0).) 
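
The dependence of the result on the order of the mappings is easy to
reproduce numerically. The sketch below uses the matrices of Fig. 223a
(k = 2) and Fig. 223c ($\alpha$ = 90°) but, as an assumption of ours, a test
point (1, 0) chosen for the sketch; the helper names `apply` and
`matmul` are illustrative.

```python
def apply(m, v):
    """Apply the (2 x 2) matrix m to the point v = (x1, x2)."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def matmul(a, b):
    """Product of two (2 x 2) matrices: first b is performed, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

stretch = [[2, 0], [0, 1]]     # 2-fold stretching along the x1-axis
rotate90 = [[0, -1], [1, 0]]   # rotation through 90 degrees

point = (1, 0)                 # test point chosen for this sketch
stretch_then_rotate = apply(rotate90, apply(stretch, point))
rotate_then_stretch = apply(stretch, apply(rotate90, point))
print(stretch_then_rotate, rotate_then_stretch)

# The same fact in terms of matrices: the two products differ.
print(matmul(rotate90, stretch), matmul(stretch, rotate90))
```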

Generally, two given operations (actions) are called commuting
if the result of their successive application does not depend on the
order in which they are performed, and they are called non-commuting
otherwise. (Think whether the following two “operations” commute:
(a) filling a swimming-pool with water; (b) diving into the swim- 
ming-pool.) 

7. Transformation of the Matrix of a Linear Mapping When the 
Basis Is Changed. It has been already noted that when we consider 
a linear mapping A of a space (R) into a space (S) the matrix of 
the mapping depends on the choice of the bases in both spaces. One 
and the same mapping can have a more complicated matrix relative 
to one basis and a simpler matrix relative to another basis. Let 
us investigate the relationship between the matrix of a mapping 
and the bases we choose. For this purpose let us return, for defini- 
teness, to the example in which we deduced formula (24). Let a new 


basis $\vec{p}\,'_1, \vec{p}\,'_2, \vec{p}\,'_3$ be chosen in (R). Then any vector $\vec{x}$ can be resolved
both with respect to the new basis and to the old one:

$$\vec{x} = x_1\vec{p}_1 + x_2\vec{p}_2 + x_3\vec{p}_3 = x'_1\vec{p}\,'_1 + x'_2\vec{p}\,'_2 + x'_3\vec{p}\,'_3 \tag{28}$$
where $x_1, x_2, x_3$ are the old coordinates and $x'_1, x'_2, x'_3$ are the new
coordinates of the vector $\vec{x}$. Each of the new base vectors can be
expanded with respect to the old basis:

$$\begin{aligned}
\vec{p}\,'_1 &= h_{11}\vec{p}_1 + h_{21}\vec{p}_2 + h_{31}\vec{p}_3 \\
\vec{p}\,'_2 &= h_{12}\vec{p}_1 + h_{22}\vec{p}_2 + h_{32}\vec{p}_3 \\
\vec{p}\,'_3 &= h_{13}\vec{p}_1 + h_{23}\vec{p}_2 + h_{33}\vec{p}_3
\end{aligned} \tag{29}$$
where $h_{ij}$ are some coefficients which define the transformation
from the old basis to the new basis. Substituting formulas (29)
into (28) and equating the coefficients in the same basis vectors
$\vec{p}_1, \vec{p}_2, \vec{p}_3$ we obtain

$$\begin{aligned}
x_1 &= h_{11}x'_1 + h_{12}x'_2 + h_{13}x'_3 \\
x_2 &= h_{21}x'_1 + h_{22}x'_2 + h_{23}x'_3 \\
x_3 &= h_{31}x'_1 + h_{32}x'_2 + h_{33}x'_3
\end{aligned} \tag{30}$$
(let the reader verify the calculations!). It should be noted that
a matrix of type

$$H = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}$$

that is a transformation matrix from new coordinates to old coordinates
(from $x'_1, x'_2, x'_3$ to $x_1, x_2, x_3$ in our concrete example) must
necessarily be non-degenerate. Indeed, when the coordinates $x_1, x_2, x_3$
are given we must obtain certain completely specified values of
$x'_1, x'_2, x'_3$ and system of equations (30) must therefore be compatible
and must have a unique solution. Therefore, $\det H \neq 0$.

Formulas (30) can be put down by analogy with formulas (23)
in the abridged form

$$x = Hx' \quad \text{where} \quad x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \quad \text{and} \quad x' = \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix}$$

Similarly, if we introduce a new basis $\vec{q}\,'_1, \vec{q}\,'_2$ in the space (S) and if
the transformation matrix from the new coordinates to the old
ones is K then y = Ky'. Substituting these formulas into (24)
we obtain Ky' = AHx'. Multiplying the last relation on the left
by $K^{-1}$ we get

$$y' = K^{-1}AHx', \quad \text{i.e.} \quad y' = A'x' \quad \text{where} \quad A' = K^{-1}AH$$

The matrix $A' = K^{-1}AH$ is nothing but the matrix of the mapping
in question relative to the new bases.

In particular, if (S) = (R) then K = H and therefore
$$A' = H^{-1}AH \tag{31}$$
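
Formula (31) can be tried out on a small example. The sketch below is
an illustration under our own assumptions: the matrices A and H, the
helper names `matmul` and `inverse_2x2`, and the sample coordinates are
not taken from the book.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Inverse of a non-degenerate (2 x 2) matrix with integer entries."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("the transformation matrix must be non-degenerate")
    return [[Fraction(m[1][1], det), Fraction(-m[0][1], det)],
            [Fraction(-m[1][0], det), Fraction(m[0][0], det)]]

A = [[2, 1], [1, 2]]    # matrix of the mapping relative to the old basis
H = [[1, 1], [1, -1]]   # columns: new base vectors in the old coordinates

A_new = matmul(inverse_2x2(H), matmul(A, H))   # formula (31)
print(A_new)

# Consistency check: y = Ax and y' = A'x' describe the same vector.
x_new = [[1], [2]]             # new coordinates x'
x_old = matmul(H, x_new)       # x = Hx'
y_old = matmul(A, x_old)       # y = Ax
y_new = matmul(A_new, x_new)   # y' = A'x'
print(matmul(H, y_new), y_old)
```

In this particular example the new matrix turns out to be diagonal,
which shows how a suitable choice of basis can simplify the matrix of
a mapping.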


Let us consider the geometric meaning of the determinant of the
matrix of an affine mapping of a plane onto itself. Let the mapping
be defined by formulas (25) and let us first suppose that the basis
$\vec{p}_1, \vec{p}_2$ taken in the plane (P) is a Cartesian basis. According to our
condition we have $(\bar{P}) = (P)$ here. The coordinates of vectors being
transformed according to formulas (27), the vectors $\vec{p}_1, \vec{p}_2$ will be
carried into the vectors $\vec{s}_1 = a_{11}\vec{p}_1 + a_{21}\vec{p}_2$, $\vec{s}_2 = a_{12}\vec{p}_1 + a_{22}\vec{p}_2$
respectively (why?). The area of the square constructed on the
vectors $\vec{p}_1, \vec{p}_2$ equals unity whereas the area of the parallelogram
which is the image of the square under the mapping, that is of the
parallelogram constructed on the vectors $\vec{s}_1, \vec{s}_2$, is equal to
$\left|\, \begin{vmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{vmatrix} \,\right| = |\det A|$, according to the end of Sec. VII.13. Now remark that
all the parts of a plane are changed in a similar fashion under an 
affine mapping and therefore the areas of all geometric figures change 
proportionally with the same factor of proportionality. Hence, 
| det A | is equal to the factor of proportionality defining the change 
of the areas under the mapping in question. The sign of det A also 
has a certain geometric meaning. Namely, if the determinant is posi- 
tive the direction of describing the contour of a figure is retained 
under the mapping, that is if we describe the contour of a preimage 
in the positive direction the contour of the image is also described 
in the positive direction. If det A < 0 then the direction of des- 
cribing a contour is replaced by the opposite direction under the 
mapping. (Let the reader verify all the assertions for examples in 
Fig. 223.) 
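
The role of det A as the factor of proportionality of areas, together
with the meaning of its sign, can be checked on the unit square. The
following sketch assumes a Cartesian basis; the helper names and the
sample matrices are ours, and the signed area is computed by the usual
"shoelace" sum, which is positive for a contour described in the
positive (counter-clockwise) direction.

```python
def apply(m, p):
    """Apply the (2 x 2) matrix m to the point p = (x1, x2)."""
    return (m[0][0] * p[0] + m[0][1] * p[1],
            m[1][0] * p[0] + m[1][1] * p[1])

def signed_area(polygon):
    """Signed area of a polygon given by its vertices in order."""
    total = 0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Unit square described in the positive direction; its area equals 1.
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

general = [[2, 1], [0, 3]]       # illustrative matrix with det A = 6
reflection = [[1, 0], [0, -1]]   # Fig. 223e, det A = -1

for m in (general, reflection):
    image = [apply(m, p) for p in unit_square]
    print(det(m), signed_area(image))
```

For the reflection the signed area of the image is negative: the
contour of the image is described in the opposite direction.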
If now we pass to an arbitrary new basis we shall have 


$$\det A' = \det (H^{-1}AH) = \det (H^{-1}) \det A \det H = (\det H)^{-1} \det A \det H = \det A$$


and thus we see that although the matrix of an affine mapping depends 
on the choice of the basis with respect to which it is considered the 
determinant of the matrix is independent of the choice, that is its 
geometric meaning is the same for all possible choices of the basis. 

If we consider an affine mapping of one plane onto another then 
| det A | is also equal to the coefficient of proportionality defining 
the change of the areas if we measure the areas on each of the planes 
relative to the areas of the corresponding parallelograms constructed 
on the basis vectors. 

In the case of an affine mapping of the geometric space onto itself 
we can analogously show that | det A | equals the proportionality 
factor defining the change of the volumes. In this case det A has the 
sign + or — depending on whether the right-handed triads of vectors 
remain right-handed or turn into the left-handed triads under the 
mapping in question. The geometric meaning of the determinant 
of the matrix of an affine mapping of a Cartesian space of any di- 
mension onto itself can be interpreted in a similar way. In particu- 
lar, this meaning immediately implies formula (5) because when 
we perform two affine mappings in succession the corresponding 
factors of proportionality defining the changes of the volumes are
mutually multiplied. 

8. The Matrix of a Mapping Relative to the Basis Consisting of
Its Eigenvectors. Let us consider a linear mapping A of a linear space
(R) into itself. If a nonzero vector x goes into a parallel vector under
the mapping, that is if the mapping of the vector x reduces to the
multiplication by a scalar ($Ax = \lambda x$) then x is called an eigenvector
of the mapping A corresponding to the eigenvalue $\lambda$.

For instance, any vector parallel to the $x_1$-axis is an eigenvector
corresponding to the eigenvalue k of the mapping shown in the
first example in Fig. 223. In this example any vector parallel to
the $x_2$-axis is also an eigenvector; it corresponds to the eigenvalue 1,
that is it does not change under the mapping. Let the reader find
the eigenvectors and the eigenvalues for the other examples in
Fig. 223.


If we have chosen a basis in (R) we can consider the number vector
x consisting of the coordinates of the vector instead of the vector
itself. Then, according to Sec. 6, the equality defining an eigenvector
acquires the form $Ax = \lambda x$, i.e. form (11). Hence, by Sec. 4,
the vector x must be an eigenvector of the matrix A of the mapping 
in question relative to the chosen basis. In Sec. 4 we established 
the method of finding these vectors. On the basis of Sec. 4, we con- 
clude that the number of the eigenvalues of the mapping A coin- 
cides with the dimension of the space (R) but there can be imaginary 
values and coinciding values among them. For instance, in the 
example (c) in Fig. 223 all the eigenvalues are imaginary (check 
it up!) and therefore none of the nonzero vectors remains parallel 
to its original direction under the mapping. (By the way, Sec. VIII.8 
implies that if the dimension of (R) is odd then equation (13) posses- 
ses at least one real root and there is therefore at least one eigen- 
vector.) 
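
For a second-order matrix the eigenvalues are the roots of a quadratic equation, so the appearance of imaginary values is easy to observe. The following sketch is an illustration only: the function name and the two sample matrices (a stretching along the first axis with k = 3 and a rotation through 90°) are assumptions made here, not data taken from Fig. 223.

```python
import cmath
import math

def eigenvalues_2x2(a11, a12, a21, a22):
    """Roots of lambda^2 - (a11 + a22) lambda + (a11 a22 - a12 a21) = 0."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = cmath.sqrt(trace * trace - 4 * det)   # complex square root, so imaginary roots are allowed
    return (trace + root) / 2, (trace - root) / 2

# Stretching along the first axis with the factor k = 3: both eigenvalues are real.
stretch = eigenvalues_2x2(3.0, 0.0, 0.0, 1.0)

# Rotation through 90 degrees: no nonzero vector keeps its direction.
phi = math.pi / 2
rotation = eigenvalues_2x2(math.cos(phi), -math.sin(phi),
                           math.sin(phi), math.cos(phi))
```

The first pair consists of the real numbers 3 and 1, whereas the second pair is (up to rounding) the imaginary pair i and -i.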

For definiteness, let us suppose that the space (R) is three-dimensional.
Besides, let us suppose that there are three linearly independent
(real) eigenvectors of the mapping A (let these vectors be
$l_1$, $l_2$, $l_3$ and the eigenvalues be $\lambda_1$, $\lambda_2$, $\lambda_3$). Let us take
these vectors as a basis in (R). It turns out that in such a case the
matrix A acquires a form which is especially simple. Indeed, let us write

$$\begin{aligned}
y_1' &= a_{11}' x_1' + a_{12}' x_2' + a_{13}' x_3' \\
y_2' &= a_{21}' x_1' + a_{22}' x_2' + a_{23}' x_3' \\
y_3' &= a_{31}' x_1' + a_{32}' x_2' + a_{33}' x_3'
\end{aligned} \tag{32}$$

where the numbers $a_{jk}'$ (j, k = 1, 2, 3) are the elements of the matrix
relative to the basis which are yet unknown and $x_1'$, $x_2'$, $x_3'$ are the
coordinates in this basis. The vector $l_1$ has the coordinates (projections)
$x_1' = 1$, $x_2' = 0$, $x_3' = 0$. After the mapping it goes into the
vector $\lambda_1 l_1$ with the coordinates $y_1' = \lambda_1$, $y_2' = 0$, $y_3' = 0$. We must
therefore have

$$\begin{aligned}
\lambda_1 &= a_{11}' \cdot 1 + a_{12}' \cdot 0 + a_{13}' \cdot 0, \\
0 &= a_{21}' \cdot 1 + a_{22}' \cdot 0 + a_{23}' \cdot 0, \\
0 &= a_{31}' \cdot 1 + a_{32}' \cdot 0 + a_{33}' \cdot 0
\end{aligned}$$

from which we find $a_{11}' = \lambda_1$, $a_{21}' = 0$ and $a_{31}' = 0$. We similarly
obtain $a_{22}' = \lambda_2$, $a_{33}' = \lambda_3$, $a_{12}' = a_{32}' = a_{13}' = a_{23}' = 0$ (check it
up!). Therefore formulas (32) in fact have the form

$$y_1' = \lambda_1 x_1', \qquad y_2' = \lambda_2 x_2', \qquad y_3' = \lambda_3 x_3'$$

Consequently, the matrix of a linear mapping relative to the basis
consisting of its eigenvectors has the diagonal form

$$A' = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} = \mathrm{diag}(\lambda_1, \lambda_2, \lambda_3)$$


By Sec. 4, for such a basis to exist, it is sufficient that all the
roots of the characteristic equation of the matrix A be real and
distinct. Any matrix can be regarded as the matrix of a linear mapping,
and the matrix of a mapping is transformed according to formula (31)
when the basis is changed. Hence, the above result can
also be formulated as follows: if all the roots of the characteristic
equation of a square matrix A are real and distinct it is possible to
find a non-degenerate matrix H such that the matrix $H^{-1}AH$ will be
diagonal and the diagonal elements will be equal to the roots.
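
The statement can be verified on a small example with exact arithmetic. In the sketch below the matrix `A` and its eigenvectors are chosen here purely for illustration (its characteristic roots are 2 and 5); the columns of `H` are the eigenvectors, and the product $H^{-1}AH$ is computed with exact fractions.

```python
from fractions import Fraction

def matmul(p, q):
    """Product of two matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

def inverse2(m):
    """Inverse of a non-degenerate 2x2 matrix."""
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

F = Fraction
A = [[F(4), F(1)], [F(2), F(3)]]     # illustrative matrix; characteristic roots 2 and 5

# Columns of H are eigenvectors of A: (1, -2) for the root 2 and (1, 1) for the root 5.
H = [[F(1), F(1)], [F(-2), F(1)]]

A_diag = matmul(matmul(inverse2(H), A), H)   # expected to be diag(2, 5)
```

The result `A_diag` has the roots 2 and 5 on the diagonal and zeros elsewhere; the order of the roots follows the order in which the eigenvectors were placed in `H`.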

If the characteristic equation of a matrix has an imaginary root 
we can find the corresponding eigenvector (number vector) by sol- 
ving equations (12), and the coordinates of the vector will also be 
imaginary. Such an eigenvector has no geometric meaning. For 
instance, this is the case for the third example in Fig. 223. But 
of course we can use such number vectors without considering their 
geometric meaning. If we admit complex values of the projections 
in question then all the calculations remain true. In particular, the
assertion of the preceding paragraph will remain true for any square
matrix all of whose eigenvalues (i.e. the roots of the characteristic
equation) are distinct. But of course in the general case the matrix
H may be complex.

If the characteristic equation of a matrix has multiple roots such 
a matrix cannot be reduced to the diagonal form in the general 
case. In particular, this is the case for the fourth example in Fig. 223. 

In conclusion, let us note that although a matrix A is transformed
according to formula (31) when the basis is changed its characteristic
equation does not change. Indeed, we have

$$\begin{aligned}
\det(A' - \lambda I) &= \det(H^{-1}AH - \lambda I) = \det[H^{-1}(A - \lambda I)H] \\
&= \det(H^{-1}) \det(A - \lambda I) \det H = \frac{1}{\det H} \det(A - \lambda I) \det H = \det(A - \lambda I)
\end{aligned} \tag{33}$$

which is what we set out to prove.

9. Transforming Cartesian Basis. In this section we shall suppose 
that (R) is not only a linear space but also a Euclidean space (see 
Secs. VII.20-21). Thus we can speak about Cartesian (Euclidean) 
bases in (R). Let us investigate the properties of a matrix which 
defines the transformation from a Cartesian basis to another Cartesian 
basis. For definiteness, let us consider the space (R) to be three- 
dimensional. In order to investigate the transformation we shall 


use formulas (29). Let $p_1$, $p_2$, $p_3$ form a Cartesian basis, that is let
them play the same role as vectors i, j, k in § VII.3. For $p_1'$, $p_2'$, $p_3'$
also to form a Cartesian basis it is necessary and sufficient that there
should be

$$p_1' \cdot p_1' = p_2' \cdot p_2' = p_3' \cdot p_3' = 1 \quad \text{and} \quad p_1' \cdot p_2' = p_1' \cdot p_3' = p_2' \cdot p_3' = 0$$

(why?). Putting down the scalar products in full according to formula
(VII.12) we obtain

$$\begin{aligned}
& h_{11}^2 + h_{21}^2 + h_{31}^2 = h_{12}^2 + h_{22}^2 + h_{32}^2 = h_{13}^2 + h_{23}^2 + h_{33}^2 = 1, \\
& h_{11}h_{12} + h_{21}h_{22} + h_{31}h_{32} = h_{11}h_{13} + h_{21}h_{23} + h_{31}h_{33} = h_{12}h_{13} + h_{22}h_{23} + h_{32}h_{33} = 0
\end{aligned}$$

These six equalities can be put down in the matrix form

$$\begin{pmatrix} h_{11} & h_{21} & h_{31} \\ h_{12} & h_{22} & h_{32} \\ h_{13} & h_{23} & h_{33} \end{pmatrix}
\begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

(Let the reader check up the last relation by multiplying the matrices
entering into the left-hand side according to the rules of Sec. 2
and comparing the result with the right-hand side.) Using the
notation introduced in Secs. 1 and 3 we can rewrite the above condition
as

$$H^*H = I, \quad \text{i.e.} \quad H^* = H^{-1} \tag{34}$$

A matrix satisfying these equalities, that is a matrix whose inverse
is equal to its transpose, is called orthogonal. Hence, the property proved
above can be stated as follows: the matrix of a transformation from
a Cartesian coordinate system to another Cartesian system is
orthogonal. We can similarly show that, conversely, if the transformation
matrix is orthogonal then any Cartesian basis is necessarily
transformed into a Cartesian basis. [Verify that the matrices
of transformations (II.3) and (II.4) are orthogonal.]

Orthogonal matrices are also encountered in connection with 
mappings. Namely, a linear mapping of a Euclidean space into 
itself is called orthogonal if the lengths of the vectors are not changed 
under the mapping (such mappings are also called isometric). Any 
triangle being carried into an equal triangle under such a mapping 
(according to the well-known test for the equality of triangles), we 
see that all the angles are retained in this case. An orthogonal map- 
ping can either be a motion of the space as a whole or a combination 
of a motion and a reflection (in a hyperplane). 

For instance, the third and the fifth mappings among those shown 
in Fig. 223 are orthogonal (the former defines a motion and the 
latter a reflection). 

By analogy with formulas (22) and by arguments similar to those
at the beginning of this section we can easily verify that the matrix
A of an orthogonal mapping relative to a Cartesian basis is an orthogonal
matrix. Conversely, if a mapping has an orthogonal matrix
relative to a Cartesian basis the mapping is orthogonal.

Equality A*A = I [see formula (34)] implies that

$$\det A^* \cdot \det A = (\det A)^2 = \det I = 1$$

from which it follows that det A = ±1. This is also implied by
the geometric meaning of the determinant of the matrix of a linear
mapping (see Sec. 7). The determinant equals 1 for a motion and
-1 for a reflection or for a combination of a reflection and a motion.
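
Condition (34) and the values det A = ±1 are easy to test for particular matrices. The sketch below is an illustration with matrices chosen here (a plane rotation through an arbitrary angle, a reflection in the first axis, and a shear as a counterexample); the helper names are likewise assumptions of this sketch.

```python
import math

def transpose(m):
    """Transposed matrix."""
    return [list(row) for row in zip(*m)]

def matmul(p, q):
    """Product of two matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_orthogonal(m, tol=1e-12):
    """True if the product of the transpose of m and m equals I within the tolerance."""
    n = len(m)
    product = matmul(transpose(m), m)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

phi = 0.7                                    # arbitrary angle in radians
rotation = [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi), math.cos(phi)]]  # a motion: determinant +1
reflection = [[1.0, 0.0], [0.0, -1.0]]       # a reflection: determinant -1
shear = [[1.0, 1.0], [0.0, 1.0]]             # determinant 1, yet not orthogonal
```

The shear shows that det A = 1 alone does not make a matrix orthogonal: the condition det A = ±1 is necessary but not sufficient.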

Let us note a consequence which is used in mechanics. Let there
be given a motion of the geometric space for which the origin of
a coordinate system remains at the same place. Such a motion can
be regarded as an orthogonal mapping A of the totality of all the
vectors of a three-dimensional space. If we factor the left-hand side
of the characteristic equation according to formula (VIII.25) in the
form

$$\det(A - \lambda I) = -(\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3)$$

and then put $\lambda = 0$ in this identity we obtain $\lambda_1 \lambda_2 \lambda_3 = \det A = 1$
(the determinant equals 1 for a motion). It follows
that at least one of the eigenvalues $\lambda_k$ is real and positive
(imaginary roots occur in conjugate pairs whose product is positive). Hence,
there exists a vector $x_0 \ne 0$ for which $Ax_0 = \lambda_k x_0$. But the lengths
being retained, we must have $\lambda_k = 1$. Consequently, the motion
in question is a rotation about an axis passing through the origin
of coordinates and parallel to an eigenvector corresponding to an
eigenvalue 1.
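
The existence of the axis can be illustrated numerically. The sketch below does not solve the eigenvector equations; instead it uses a different, standard shortcut: for a rotation matrix the axis can be read off the antisymmetric part of the matrix, provided the angle of rotation is neither 0 nor 180°. The sample motion (a rotation about the z-axis followed by a rotation about the x-axis) and all names are assumptions of this sketch.

```python
import math

def matmul(p, q):
    """Product of two 3x3 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, v):
    """Image of the vector v under the mapping with matrix m."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_x(t):
    """Rotation through the angle t about the x-axis."""
    return [[1.0, 0.0, 0.0],
            [0.0, math.cos(t), -math.sin(t)],
            [0.0, math.sin(t), math.cos(t)]]

def rot_z(t):
    """Rotation through the angle t about the z-axis."""
    return [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]

def rotation_axis(a):
    """Unit vector along the axis of a rotation matrix, taken from its antisymmetric part.

    Valid only when the angle of rotation is neither 0 nor 180 degrees.
    """
    v = [a[2][1] - a[1][2], a[0][2] - a[2][0], a[1][0] - a[0][1]]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

A = matmul(rot_x(1.1), rot_z(0.4))   # a motion leaving the origin fixed
axis = rotation_axis(A)
image = matvec(A, axis)              # should coincide with axis itself
```

The vector `axis` is carried into itself, that is it is an eigenvector corresponding to the eigenvalue 1, although neither of the two constituent rotations has this axis.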

10. Symmetric Matrices. Here we shall indicate the application 
of the above results to investigating symmetric matrices (see Sec. 1). 

It is possible to prove the following properties of the symmetric 
matrices. 


All the eigenvalues of a symmetric matrix are real.
For example, a symmetric matrix of the second order is of the
form

$$\begin{pmatrix} a & b \\ b & c \end{pmatrix} \tag{35}$$

and its characteristic equation is

$$\begin{vmatrix} a - \lambda & b \\ b & c - \lambda \end{vmatrix} = 0$$

or

$$\lambda^2 - (a + c)\lambda + ac - b^2 = 0 \tag{36}$$

The roots of the equation are

$$\lambda_{1,2} = \frac{a + c}{2} \pm \sqrt{\left(\frac{a - c}{2}\right)^2 + b^2} \tag{37}$$

and they are obviously real. Here we shall not give the proof of
this assertion and of the two following assertions in the general case
for matrices of order higher than the second.

Eigenvectors of a symmetric matrix corresponding to different eigenvalues
are necessarily orthogonal to each other.

As an example, let us take matrix (35) for $b \ne 0$. The coordinates
of an eigenvector are found from system (12) which has the form

$$\begin{aligned}
(a - \lambda)x_1 + bx_2 &= 0 \\
bx_1 + (c - \lambda)x_2 &= 0
\end{aligned}$$

in our case. If $\lambda$ is an eigenvalue these two equations are dependent
(see Sec. VI.6) because the determinant

$$\begin{vmatrix} a - \lambda & b \\ b & c - \lambda \end{vmatrix}$$

equals zero. Hence, we can limit ourselves to solving only one of
the equations. For definiteness, let us take the first equation. In
order to satisfy the equation we can put $x_1 = -b$ and $x_2 = a - \lambda$.
Putting $\lambda = \lambda_1$ and $\lambda = \lambda_2$ we thus obtain two eigenvectors of the
form

$$\begin{pmatrix} -b \\ a - \lambda_1 \end{pmatrix}, \qquad \begin{pmatrix} -b \\ a - \lambda_2 \end{pmatrix} \tag{38}$$

The scalar product of the vectors is equal to

$$\begin{aligned}
b^2 + (a - \lambda_1)(a - \lambda_2) &= \lambda_1\lambda_2 - a(\lambda_1 + \lambda_2) + b^2 + a^2 \\
&= ac - b^2 - a(a + c) + b^2 + a^2 = 0
\end{aligned}$$

[in deducing the result we have taken advantage of the well-known
formulas for the sum and for the product of the roots of
quadratic equation (36)]. This implies the perpendicularity of vectors
(38). If we put b = 0 the vectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ can serve
as eigenvectors (verify this!) and thus we see that in this case they
are also mutually perpendicular.
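
As a numerical illustration (the numbers are chosen here only for the sake of example), let a = 3, b = 2, c = 0 in matrix (35). The formula for the roots of equation (36) gives

$$\lambda_{1,2} = \frac{3 + 0}{2} \pm \sqrt{\left(\frac{3 - 0}{2}\right)^2 + 2^2} = \frac{3}{2} \pm \frac{5}{2}, \qquad \lambda_1 = 4, \quad \lambda_2 = -1$$

Taking $x_1 = -b$ and $x_2 = a - \lambda$ we obtain the eigenvectors

$$\begin{pmatrix} -2 \\ -1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -2 \\ 4 \end{pmatrix}$$

Indeed,

$$\begin{pmatrix} 3 & 2 \\ 2 & 0 \end{pmatrix} \begin{pmatrix} -2 \\ -1 \end{pmatrix} = \begin{pmatrix} -8 \\ -4 \end{pmatrix} = 4 \begin{pmatrix} -2 \\ -1 \end{pmatrix}, \qquad
\begin{pmatrix} 3 & 2 \\ 2 & 0 \end{pmatrix} \begin{pmatrix} -2 \\ 4 \end{pmatrix} = \begin{pmatrix} 2 \\ -4 \end{pmatrix} = -1 \cdot \begin{pmatrix} -2 \\ 4 \end{pmatrix}$$

and the scalar product of the two vectors is $(-2)(-2) + (-1) \cdot 4 = 0$.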

If not all the roots of the characteristic equation are distinct, 
that is if there is at least one multiple root (let the multiplicity of
the root be k), it is possible to find k mutually orthogonal eigenvectors
corresponding to this eigenvalue.

For example, formula (37) implies that matrix (35) possesses a 
double eigenvalue if and only if a =c and b = 0. But then all 
the vectors are eigenvectors (check it up!) and thus we can choose 
two mutually perpendicular vectors among them. 

The above properties imply that if a given symmetric matrix A 
is regarded as the matrix of a linear mapping relative to a Cartesian 
basis then we can always find a new Cartesian basis which entirely 
consists of eigenvectors of the matrix A. For instance, in the three- 
dimensional case the characteristic equation is of degree three 
and thus it has three roots which, as it has been indicated, 
will be real. If these roots are distinct from each other the corres- 
ponding eigenvectors are mutually perpendicular. We can choose 
these vectors so that they should be of unit length, and then they 
can be taken as the sought-for basis. If $\lambda_1 = \lambda_2 \ne \lambda_3$ we can choose
two mutually perpendicular eigenvectors corresponding to the
eigenvalue $\lambda_1$, and the vector corresponding to the eigenvalue $\lambda_3$
will be perpendicular to both vectors. Finally, if all the three eigenvalues
are the same we can indicate three mutually perpendicular
eigenvectors corresponding to the eigenvalue.

The transformation from one Cartesian basis to another is performed
by means of an orthogonal matrix (see Sec. 9). Besides, if
the latter basis consists of eigenvectors of a matrix, the matrix takes the
diagonal form after being transformed according to formula (31)
(see Sec. 8). Consequently, the property proved in the preceding
paragraph can be formulated in terms of matrices as follows: for
any symmetric matrix A, it is possible to find an orthogonal matrix
H such that the matrix $H^{-1}AH$ is diagonal and the diagonal
elements are equal to the eigenvalues of the matrix A.
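
For a symmetric matrix of the second order the orthogonal matrix H can be written down directly from the formulas of this section. The sketch below is a minimal illustration under stated assumptions: it treats only the case $b \ne 0$, the function name and the sample values of a, b, c are chosen here, and for an orthogonal H the transpose is used in place of the inverse.

```python
import math

def symmetric_eigen_2x2(a, b, c):
    """Eigenvalues and unit eigenvectors of the matrix [[a, b], [b, c]] with b != 0."""
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b * b)
    lambdas = (mean + radius, mean - radius)
    vectors = []
    for lam in lambdas:
        v = (-b, a - lam)                 # satisfies (a - lam) x1 + b x2 = 0
        norm = math.hypot(v[0], v[1])
        vectors.append((v[0] / norm, v[1] / norm))
    return lambdas, vectors

a, b, c = 2.0, 2.0, 5.0                   # illustrative symmetric matrix
(lam1, lam2), (v1, v2) = symmetric_eigen_2x2(a, b, c)

A = [[a, b], [b, c]]
H = [[v1[0], v2[0]], [v1[1], v2[1]]]      # unit eigenvectors as columns

# D = H* A H written out entry by entry (H* is the transpose of H).
D = [[sum(H[k][i] * A[k][l] * H[l][j] for k in range(2) for l in range(2))
      for j in range(2)] for i in range(2)]
```

For these values the eigenvalues are 6 and 1, the two columns of `H` are mutually perpendicular unit vectors, and `D` is diagonal with 6 and 1 on the diagonal up to rounding errors.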


§ 3. Quadratic Forms 


11. Quadratic Forms. A quadratic form in several variables is 
a homogeneous polynomial of the second degree in these variables. 
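For example, a quadratic form in the three variables $x_1$, $x_2$, $x_3$ has
the general form

$$F = a_{11}x_1^2 + a_{22}x_2^2 + a_{33}x_3^2 + 2a_{12}x_1x_2 + 2a_{13}x_1x_3 + 2a_{23}x_2x_3 \tag{39}$$

where $a_{11}$, $a_{22}$, ..., $a_{23}$ are numerical coefficients [some of the
coefficients are doubled in (39) to simplify further formulas]. The
symmetric matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}$$

is called the matrix of the quadratic form. With the help of the
matrix we can rewrite formula (39) as

$$\begin{aligned}
F &= (a_{11}x_1 + a_{12}x_2 + a_{13}x_3)x_1 + (a_{12}x_1 + a_{22}x_2 + a_{23}x_3)x_2 \\
&\quad + (a_{13}x_1 + a_{23}x_2 + a_{33}x_3)x_3 = y_1x_1 + y_2x_2 + y_3x_3 = (x_1 \; x_2 \; x_3) \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
\end{aligned}$$

where

$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
= \begin{pmatrix} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\ a_{12}x_1 + a_{22}x_2 + a_{23}x_3 \\ a_{13}x_1 + a_{23}x_2 + a_{33}x_3 \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= A \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$

Thus, if we introduce the number vector $x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$ we obtain

$$F = x^*Ax \tag{40}$$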


Conversely, if a form is represented as (40) and if the matrix A 
is symmetric then A is the matrix of this quadratic form. 

Let us now perform an arbitrary linear transformation of the 
variables of form (30). The transformation can be put down in the 
matrix form as 

x = Hx’ (41) 


Then, by formula (4), we have x* = x’*H* which implies 
F = x'*H*AHx’ = x’* (H*AH) x’ 
that is 
F =x'*A’x’ where A’ = H*AH (42) 
But the matrix A' is symmetric because, by formula (4), we have

$$A'^* = (H^*AH)^* = H^*A^*H^{**} = H^*AH = A'$$
Hence, it is A’ that is the matrix of the quadratic form after the 
change of variables is made. 


Thus, substitution (41) yields the transformation of the matrix
of a quadratic form according to formula (42). In particular, if H
is an orthogonal matrix then, by formula (34), we see that
$A' = H^{-1}AH$. As it has been shown (see Sec. 10), we can always choose
a matrix H such that there should be $A' = \mathrm{diag}(\lambda_1, \lambda_2, \lambda_3)$ where
the diagonal elements are the eigenvalues of the matrix A. But then
the quadratic form will acquire the diagonal form

$$F = \lambda_1 x_1'^2 + \lambda_2 x_2'^2 + \lambda_3 x_3'^2 \tag{43}$$

in the new variables. Consequently, any quadratic form (39) can be
reduced to diagonal form (43) (where $\lambda_1$, $\lambda_2$, $\lambda_3$ are the eigenvalues
of the matrix A) by means of transformation (30) with an orthogonal
matrix H.
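
The reduction can be checked on a form in two variables. The sketch below is an illustration with data chosen here: the form $F = 5x_1^2 + 4x_1x_2 + 2x_2^2$ has the matrix with eigenvalues 6 and 1, and the columns of `H` are the corresponding unit eigenvectors. The value of the form is computed once from the old variables and once from the diagonal expression in the new variables.

```python
import math

def quadratic_form(A, x):
    """F = x* A x for a symmetric matrix A and a number vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

A = [[5.0, 2.0], [2.0, 2.0]]            # F = 5 x1^2 + 4 x1 x2 + 2 x2^2
s = 1 / math.sqrt(5)
H = [[2 * s, 1 * s],
     [1 * s, -2 * s]]                   # columns: unit eigenvectors for the eigenvalues 6 and 1

x_new = [0.3, -1.7]                     # arbitrary values of the new variables
x_old = [H[0][0] * x_new[0] + H[0][1] * x_new[1],
         H[1][0] * x_new[0] + H[1][1] * x_new[1]]   # substitution x = H x'

F_old = quadratic_form(A, x_old)                    # the form in the old variables
F_new = 6 * x_new[0] ** 2 + 1 * x_new[1] ** 2       # the diagonal form in the new variables
```

The two values coincide up to rounding errors whatever values of the new variables are taken.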

The above formal transformation has the following geometric
meaning. Let us regard A as the matrix of a linear mapping A relative
to a Cartesian basis with the coordinates $x_1$, $x_2$, $x_3$. Then
transformation (41) reducing the quadratic form F to form (43) corresponds
to the transformation to a new basis consisting of eigenvectors
of the mapping A.

In Sec. 8 [see formula (33)] we showed that any transformation
of form (31) does not change the determinant $\det(A - \lambda I)$. Hence,
if we expand the determinant in powers of $\lambda$ the coefficients of the
powers will not change; they will be invariant with respect to any
transformation of a Cartesian coordinate system to another one.
For instance, a quadratic form in two variables is expressed by the
formula

$$Ax^2 + 2Bxy + Cy^2$$

(we have put down the formula using the notation applied in analytic
geometry), that is its matrix is of the form

$$\begin{pmatrix} A & B \\ B & C \end{pmatrix}$$

and its characteristic equation is written as

$$\begin{vmatrix} A - \lambda & B \\ B & C - \lambda \end{vmatrix} = \lambda^2 - (A + C)\lambda + AC - B^2 = 0$$

Hence, the expressions A + C and $AC - B^2$ are invariant with
respect to any change of Cartesian coordinates (see Sec. II.13).
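
The invariance can be observed directly by rotating the coordinate axes. In the sketch below the rotation $x = x' \cos\varphi - y' \sin\varphi$, $y = x' \sin\varphi + y' \cos\varphi$ is substituted into the form and the new coefficients are collected; the function name, the sample coefficients and the angle are assumptions of this illustration.

```python
import math

def rotate_form(A, B, C, phi):
    """Coefficients of A x^2 + 2 B x y + C y^2 after the rotation of axes

    x = x' cos(phi) - y' sin(phi),  y = x' sin(phi) + y' cos(phi).
    """
    c, s = math.cos(phi), math.sin(phi)
    A_new = A * c * c + 2 * B * s * c + C * s * s
    B_new = (C - A) * s * c + B * (c * c - s * s)
    C_new = A * s * s - 2 * B * s * c + C * c * c
    return A_new, B_new, C_new

A, B, C = 3.0, 2.0, -1.0                 # illustrative coefficients
A1, B1, C1 = rotate_form(A, B, C, 0.9)   # arbitrary angle in radians

sum_before, sum_after = A + C, A1 + C1
det_before, det_after = A * C - B * B, A1 * C1 - B1 * B1
```

The individual coefficients change under the rotation, whereas the sum A + C and the expression $AC - B^2$ keep their values up to rounding errors.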

12. Simplification of Equations of Second-Order Curves and
Surfaces. The transformation of a quadratic form described in
Sec. 11 is applied, in particular, to simplifying equations of curves
and surfaces of the second order. Let us dwell on equations of surfaces
since the problem of simplifying equations of curves of the
second order was considered in Sec. II.13.

Let the equation of a second-order surface be represented in ordinary
form (X.13) used in analytic geometry. The transformation
to a new Cartesian coordinate system having the same origin is
reduced to a change of variables of the form

$$\begin{aligned}
x &= h_{11}x' + h_{12}y' + h_{13}z' \\
y &= h_{21}x' + h_{22}y' + h_{23}z' \\
z &= h_{31}x' + h_{32}y' + h_{33}z'
\end{aligned} \tag{44}$$


as it was shown in Sec. 9, where $H = (h_{ij})$ is an orthogonal transformation
matrix. (It is evident that if the origin of coordinates is
left unchanged the coordinates of points and the coordinates of
vectors are transformed according to the same formulas.) Substituting
these expressions into equation (X.13) we see that the groups
of summands containing the terms of the first degree and of the
second degree are transformed independently. Let us consider the
transformation of the group of the second-order terms which is
a quadratic form. On the basis of Sec. 9, we conclude that we can
always choose a coordinate system x', y', z' so that this group of
terms should acquire the diagonal form

$$\lambda_1 x'^2 + \lambda_2 y'^2 + \lambda_3 z'^2$$

Hence, the whole equation (X.13) will have the form

$$\lambda_1 x'^2 + \lambda_2 y'^2 + \lambda_3 z'^2 + G'x' + H'y' + I'z' + K = 0 \tag{45}$$

where $\lambda_1$, $\lambda_2$, $\lambda_3$ are the roots of the equation

$$\begin{vmatrix} A - \lambda & B & D \\ B & C - \lambda & E \\ D & E & F - \lambda \end{vmatrix} = 0$$

and G', H', I' are some new coefficients in the terms of the first
degree which occur after substitution (44) has been made. Equation
(45) is nothing but equation (X.14) put down in the different notation.
It was investigated in Sec. X.14.


§ 4. Non-Linear Mappings 


13. General Notions. Let us begin with a mapping of a plane
into a plane. Suppose that there are two planes (P) and $(\bar{P})$ (by the
way, the planes may coincide). Let to each point M of the plane
(P) (or to each point M taken from a domain in the plane) there
correspond a point $\bar{M}$ of the plane $(\bar{P})$, according to a certain law.
Then we say that we are given a mapping of the plane (P) (or of
its domain) into the plane $(\bar{P})$. For the mappings of a specific class,
curves go into curves and geometric figures into geometric figures
under a mapping of the plane (P) into the plane $(\bar{P})$ although
the form of a geometric figure may change considerably (see Sec.
VIII.11). Such a mapping is depicted in Fig. 224, and we see that
knowing a preimage in the plane (P) it is difficult to recognize its
image in the plane $(\bar{P})$ and vice versa. There are also the cases of
degeneration when some geometric figures are “contracted” into curves
and even into points.

It is sometimes necessary to consider the inverse mapping which
can be obtained if we arbitrarily choose images in $(\bar{P})$ and find the
corresponding preimages in (P). As in Sec. I.21, where we considered
inverse functions, it can happen that we encounter a difficulty,
namely the fact that the inverse of a single-valued mapping may
not be single-valued. This will be the case if two distinct points in
the plane (P) (for instance, such points as the points U and V in
Fig. 224) are carried into the same point of the plane $(\bar{P})$. Such a
point will have at least two preimages.

Fig. 224

If not only the mapping in question but also its inverse mapping
are single-valued we say that there is a one-to-one mapping. If
the mapping under consideration is not one-to-one but does
not degenerate the plane (P) can be broken into parts such
that the mapping is one-to-one in each of the parts.

Mappings can be described analytically by means of coordinate
systems. To do this suppose that there is a Cartesian coordinate
system x, y in the plane (P) and a system $\bar{x}$, $\bar{y}$ in the plane $(\bar{P})$.
These systems may also be coincident. Then if we set the coordinates
x, y of a point M the coordinates $\bar{x}$, $\bar{y}$ of the corresponding point $\bar{M}$
will be completely specified. In other words, the mapping is defined
by some relationships of the form

$$\bar{x} = \bar{x}(x, y), \qquad \bar{y} = \bar{y}(x, y) \tag{46}$$

When considering the inverse mapping we set the values of $\bar{x}$ and
$\bar{y}$ in these formulas and find x and y. For the mapping to be one-to-one,
it is necessary that there should be no more than one solution
x, y of equations (46) for any given $\bar{x}$ and $\bar{y}$.




Similarly, a mapping of a three-dimensional space into another
three-dimensional space is defined by equations of the form

$$\bar{x} = \bar{x}(x, y, z), \qquad \bar{y} = \bar{y}(x, y, z), \qquad \bar{z} = \bar{z}(x, y, z) \tag{47}$$

in place of (46).
We can also consider mappings for spaces of different dimensions.
For instance, the formulas

$$\bar{x} = \bar{x}(x, y), \qquad \bar{y} = \bar{y}(x, y), \qquad \bar{z} = \bar{z}(x, y)$$

define a mapping of a plane into a three-dimensional space.

Formulas (X.5) can also be regarded as formulas defining a mapping
of an m-dimensional space with the coordinates $t_1$, $t_2$, ..., $t_m$
into an n-dimensional space with the coordinates $x_1$, $x_2$, ..., $x_n$.
Of course, the coordinate systems may not be Cartesian in the general
case.

14. Non-Linear Mapping in the Small. Let us consider a mapping
defined by formulas (46) in the vicinity of a point $M_0 (x_0, y_0)$ which
is mapped into a point $\bar{M}_0 (\bar{x}_0, \bar{y}_0)$. The increment of any function
being close to its differential to within the terms of higher order
of smallness (see Sec. IX.11), we can neglect these terms and put down

$$\begin{aligned}
\Delta \bar{x} &= \left(\frac{\partial \bar{x}}{\partial x}\right)_0 \Delta x + \left(\frac{\partial \bar{x}}{\partial y}\right)_0 \Delta y \\
\Delta \bar{y} &= \left(\frac{\partial \bar{y}}{\partial x}\right)_0 \Delta x + \left(\frac{\partial \bar{y}}{\partial y}\right)_0 \Delta y
\end{aligned} \tag{48}$$

Here $\Delta x = x - x_0$, $\Delta \bar{x} = \bar{x} - \bar{x}_0$ etc. (we can say that these are
Cartesian coordinates reckoned from $M_0$ and $\bar{M}_0$, respectively) and
the index “zero” indicates that the derivatives are taken at the point
$M_0$. Comparing these formulas with formulas (27) we conclude that
a non-linear mapping can be regarded as a linear mapping in an
infinitesimal neighbourhood of any point with an accuracy of
infinitesimals of higher order of smallness.
On the basis of Sec. 6, we conclude that if the determinant

$$\begin{vmatrix} \dfrac{\partial \bar{x}}{\partial x} & \dfrac{\partial \bar{x}}{\partial y} \\[2mm] \dfrac{\partial \bar{y}}{\partial x} & \dfrac{\partial \bar{y}}{\partial y} \end{vmatrix} = \frac{D(\bar{x}, \bar{y})}{D(x, y)} \tag{49}$$

(see the notation in Sec. IX.13) is unequal to zero at a point $M_0$
the mapping in question is one-to-one in an infinitesimal neighbourhood
of the point. Moreover, it can even be regarded as affine with
an accuracy of infinitesimals of higher order. Besides, the absolute
value of the determinant is equal to the proportionality factor
defining the change of the areas of infinitesimal geometric figures
(placed infinitely close to the point) under the mapping. The coefficient
is no longer constant in the whole plane as it was in the case
of a linear mapping because the determinant takes on different
values at different points in the general case. In particular, the
meaning of the determinant makes it possible to attribute a certain
geometric meaning separately to the denominator and to the numerator
of the expression $\frac{D(\bar{x}, \bar{y})}{D(x, y)}$. Namely, we can consider them as
being equal to the areas of an infinitesimal figure before the mapping
and after the mapping, respectively.

If Jacobian (49) vanishes at a point then the mapping in question
degenerates at the point, namely, the area of an infinitesimal geometric
figure becomes an infinitesimal of higher order of smallness after
the mapping has been performed. Finally, if Jacobian (49) is identically
equal to zero the mapping degenerates throughout the whole
plane which leads to a reduction of the dimension: the plane can
be mapped into a line (not necessarily into a straight line) or even
into a point.

One must not think that in case Jacobian (49) does not turn into 
zero at all the points of a finite domain the mapping in question 
will be one-to-one in the domain. A mapping can be non-degenerate 
at all the points and nevertheless it may not be one-to-one (see 
Fig. 225). 
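
Both remarks can be illustrated by a concrete mapping. The example below is chosen here and is not necessarily the one shown in Fig. 225: for $\bar{x} = e^x \cos y$, $\bar{y} = e^x \sin y$ the Jacobian equals $e^{2x}$ and never vanishes, yet the points (x, y) and (x, y + 2π) have the same image. The Jacobian is estimated by central differences; the function names and the step h are assumptions of this sketch.

```python
import math

def jacobian_2d(f, x, y, h=1e-6):
    """Central-difference estimate of the Jacobian of the mapping f(x, y) -> (xbar, ybar)."""
    fx1, fx0 = f(x + h, y), f(x - h, y)
    fy1, fy0 = f(x, y + h), f(x, y - h)
    dxbar_dx = (fx1[0] - fx0[0]) / (2 * h)
    dybar_dx = (fx1[1] - fx0[1]) / (2 * h)
    dxbar_dy = (fy1[0] - fy0[0]) / (2 * h)
    dybar_dy = (fy1[1] - fy0[1]) / (2 * h)
    return dxbar_dx * dybar_dy - dxbar_dy * dybar_dx

def exp_map(x, y):
    """xbar = e^x cos y, ybar = e^x sin y; the exact Jacobian is e^(2x)."""
    return math.exp(x) * math.cos(y), math.exp(x) * math.sin(y)

J = jacobian_2d(exp_map, 0.5, 1.0)          # local factor of change of areas at (0.5, 1.0)
p = exp_map(0.5, 1.0)
q = exp_map(0.5, 1.0 + 2 * math.pi)         # a different preimage with the same image
```

The estimate `J` is close to $e^{2 \cdot 0.5} = e$, so the mapping is non-degenerate at the point, while `p` and `q` coincide, so the mapping is not one-to-one in the whole plane.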

A mapping of a three-dimensional space into another three-dimensional
space possesses similar properties. Such a mapping can be
defined by formulas (47). Here the value of the Jacobian $\frac{D(\bar{x}, \bar{y}, \bar{z})}{D(x, y, z)}$
is also essential. Its absolute value is equal to the factor of proportionality
defining the changes of the volumes of infinitely small
solids. (What is the geometric meaning of the sign of the Jacobian?)

If the Jacobian is identically equal to zero we can pose
the problem of determining the “degree” of the degeneration, that
is whether the space x, y, z will be mapped onto a surface or onto
a curve (or even into a point) of the space $\bar{x}$, $\bar{y}$, $\bar{z}$. The answer to the
question is implied by the considerations given in Sec. 6 and by the
fact that every mapping is linear in the small (to within infinitesimals
of higher order). Thus, we must investigate the matrix

$$\begin{pmatrix}
\dfrac{\partial \bar{x}}{\partial x} & \dfrac{\partial \bar{x}}{\partial y} & \dfrac{\partial \bar{x}}{\partial z} \\[2mm]
\dfrac{\partial \bar{y}}{\partial x} & \dfrac{\partial \bar{y}}{\partial y} & \dfrac{\partial \bar{y}}{\partial z} \\[2mm]
\dfrac{\partial \bar{z}}{\partial x} & \dfrac{\partial \bar{z}}{\partial y} & \dfrac{\partial \bar{z}}{\partial z}
\end{pmatrix} \tag{50}$$

In the case of degeneration its rank is less than three at any point
(why?). If it is equal to 2 everywhere (except for some points at
which the rank can be reduced still lower) then the x, y, z-space is
mapped onto a two-dimensional surface. If the rank does not exceed
unity at any point but does not vanish identically then the space
is mapped onto a one-dimensional curve. Finally, if it is identically
equal to zero [this means that all the elements of matrix (50) are identically
equal to zero] the whole space will be mapped into a point
because this can be only if $\bar{x}$, $\bar{y}$, $\bar{z}$ are identically constant.

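A similar situation occurs when we consider mappings of spaces
of arbitrary dimensions which may be different in the general case.
As it has already been said, formulas (X.5) can be regarded as formulas
defining a mapping of an m-dimensional space with the coordinates
$t_1$, $t_2$, ..., $t_m$ into an n-dimensional space with the coordinates
$x_1$, $x_2$, ..., $x_n$. To determine the dimension of the manifold
which appears as a result of the mapping we must compose an
(n × m) matrix of the form

$$\begin{pmatrix}
\dfrac{\partial x_1}{\partial t_1} & \dfrac{\partial x_1}{\partial t_2} & \cdots & \dfrac{\partial x_1}{\partial t_m} \\[2mm]
\dfrac{\partial x_2}{\partial t_1} & \dfrac{\partial x_2}{\partial t_2} & \cdots & \dfrac{\partial x_2}{\partial t_m} \\[2mm]
\cdots & \cdots & \cdots & \cdots \\[2mm]
\dfrac{\partial x_n}{\partial t_1} & \dfrac{\partial x_n}{\partial t_2} & \cdots & \dfrac{\partial x_n}{\partial t_m}
\end{pmatrix}$$

If the rank of the matrix equals k (we admit a further reduction
of the rank at some points) then the dimension is equal to k.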
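
The rank test can be tried on two mappings of the $t_1$, $t_2$-plane into a three-dimensional space. Both mappings, their hand-written matrices of partial derivatives and the function names are assumptions of this illustration; the rank is found by Gaussian elimination over exact fractions and is evaluated at one sample point, which illustrates but does not prove the statement for all points.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix found by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0                                   # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def jacobian_curve(t1, t2):
    """x1 = t1 + t2, x2 = (t1 + t2)^2, x3 = (t1 + t2)^3: the image is a curve."""
    s = t1 + t2
    return [[1, 1], [2 * s, 2 * s], [3 * s * s, 3 * s * s]]

def jacobian_surface(t1, t2):
    """x1 = t1, x2 = t2, x3 = t1 * t2: the image is a surface."""
    return [[1, 0], [0, 1], [t2, t1]]

rank_curve = rank(jacobian_curve(2, 3))       # expected dimension of the image: 1
rank_surface = rank(jacobian_surface(2, 3))   # expected dimension of the image: 2
```

In the first case all the coordinates depend on $t_1 + t_2$ only, so the plane is carried into a curve and the rank equals 1; in the second case the rank equals 2 and the image is a surface.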
15. Functional Relation Between Functions. The results of Sec. 14
can be applied to the notion of “functionally dependent” systems
of functions. Let us first suppose that we are given three functions
of three independent variables:

$$F_1(x, y, z), \qquad F_2(x, y, z), \qquad F_3(x, y, z) \tag{51}$$

We say that the functions are dependent on each other if there is
a relation of the form

$$\Phi(F_1(x, y, z), F_2(x, y, z), F_3(x, y, z)) = 0 \tag{52}$$

connecting the functions where $\Phi$ is a function of three variables
of the form $\Phi = \Phi(\lambda, \mu, \nu)$ which is not identically equal to zero.
If otherwise, we say that the functions are independent. We can solve
relation (52) for one of the variables $F_1$, $F_2$ or $F_3$, and therefore
we can also say that functions (51) are dependent if one of them is
expressible as a function of the others.

For example, the functions

$$\left(\frac{x - y}{x + z}\right)^2, \qquad \ln(x + z) \qquad \text{and} \qquad x - y \tag{53}$$

are dependent because if we denote them as $F_1$, $F_2$, $F_3$ we have
$F_1 e^{2F_2} - F_3^2 = 0$.

To establish a test for the existence of a functional relation between
functions in the general case let us consider an auxiliary mapping
of the form

$$\begin{aligned}
\lambda &= F_1(x, y, z) \\
\mu &= F_2(x, y, z) \\
\nu &= F_3(x, y, z)
\end{aligned} \tag{54}$$

For functions (51) to be dependent, it is necessary that relation
(52), that is the relation $\Phi(\lambda, \mu, \nu) = 0$, should be true for all x,
y, z. The relation defines a surface in the $\lambda$, $\mu$, $\nu$-space. Therefore,
we see that for functions (51) to be dependent, it is necessary that
the x, y, z-space be carried into a surface in the $\lambda$, $\mu$, $\nu$-space under
mapping (54). This means that the mapping should be degenerate.
Applying the result of Sec. 14 we arrive at the condition

$$\frac{D(F_1, F_2, F_3)}{D(x, y, z)} = 0 \tag{55}$$

which is necessary and sufficient for functions (51) to be dependent.
[Let the reader check up the fulfilment of condition (55) for functions
(53).]

If the rank of the matrix

$$\begin{pmatrix}
\dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} & \dfrac{\partial F_1}{\partial z} \\[2mm]
\dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} & \dfrac{\partial F_2}{\partial z} \\[2mm]
\dfrac{\partial F_3}{\partial x} & \dfrac{\partial F_3}{\partial y} & \dfrac{\partial F_3}{\partial z}
\end{pmatrix}$$

equals unity the x, y, z-space, as it was shown in Sec. 14, is carried
into a curve in the $\lambda$, $\mu$, $\nu$-space. Taking advantage of equations
(X.2) we then conclude that in this case the functions $F_1$, $F_2$, $F_3$
are connected with each other by two independent relations of
form (52).
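
The test can be tried on a triple of functions which is dependent by construction. The example is chosen here, not taken from the text: for $F_1 = x + y + z$, $F_2 = x^2 + y^2 + z^2$, $F_3 = xy + yz + zx$ we have $F_1^2 - F_2 - 2F_3 = 0$ identically, whereas the triple x, xy, xyz is independent. The partial derivatives are written out by hand and the Jacobian is evaluated at a few sample points, which illustrates the condition but is of course not a proof that it holds identically.

```python
def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def jacobian_dependent(x, y, z):
    """Rows: partial derivatives of F1 = x + y + z, F2 = x^2 + y^2 + z^2, F3 = xy + yz + zx."""
    return [[1, 1, 1], [2 * x, 2 * y, 2 * z], [y + z, x + z, x + y]]

def jacobian_independent(x, y, z):
    """Rows: partial derivatives of F1 = x, F2 = x y, F3 = x y z."""
    return [[1, 0, 0], [y, x, 0], [y * z, x * z, x * y]]

def relation(x, y, z):
    """Phi(F1, F2, F3) = F1^2 - F2 - 2 F3 evaluated on the dependent triple."""
    f1 = x + y + z
    f2 = x * x + y * y + z * z
    f3 = x * y + y * z + z * x
    return f1 * f1 - f2 - 2 * f3

sample_points = [(1, 2, 3), (-4, 5, 7), (2, -1, 6)]   # arbitrary integer points
dependent_dets = [det3(jacobian_dependent(*p)) for p in sample_points]
independent_dets = [det3(jacobian_independent(*p)) for p in sample_points]
```

At every sample point the Jacobian of the first triple is exactly zero, while that of the second triple equals $x^2 y$ and is different from zero.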

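A similar result is also obtained in the case when the number of
independent variables differs from the number of functions. For
instance, two functions of three variables of the form $F_1(x, y, z)$
and $F_2(x, y, z)$ will be dependent if the mapping

$$\lambda = F_1(x, y, z), \qquad \mu = F_2(x, y, z)$$

will transform the x, y, z-space into a curve with an equation of
the form $\Phi(\lambda, \mu) = 0$ in the $\lambda$, $\mu$-plane. According to Sec. 14,
the condition guaranteeing the above property is that the rank of
the matrix

$$\begin{pmatrix}
\dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} & \dfrac{\partial F_1}{\partial z} \\[2mm]
\dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} & \dfrac{\partial F_2}{\partial z}
\end{pmatrix}$$

should be less than two, that is there should be equalities of the form

$$\begin{vmatrix} \dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} \\[2mm] \dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} \end{vmatrix} = 0, \qquad
\begin{vmatrix} \dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial z} \\[2mm] \dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial z} \end{vmatrix} = 0 \qquad \text{and} \qquad
\begin{vmatrix} \dfrac{\partial F_1}{\partial y} & \dfrac{\partial F_1}{\partial z} \\[2mm] \dfrac{\partial F_2}{\partial y} & \dfrac{\partial F_2}{\partial z} \end{vmatrix} = 0$$

The condition for an arbitrary number of functions of any number
of arguments to be functionally related is put down in a similar
form. It should be noted that in case the number of functions exceeds
the number of independent variables the functions are always dependent.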


CHAPTER XII 


Applications of Partial 
Derivatives 


§ 1. Scalar Field 


1. Directional Derivative. Gradient. Let a Cartesian coordinate
system x, y, z in space be given. Then, according to Sec. IX.9, a
stationary scalar field can be regarded as a function u = u (x, y, z).
(When investigating a non-stationary field we can apply the same
point of view at any fixed moment of time.) Besides, let a point M
in space also be given. Suppose that a curve (L) starts from the
point M in the direction l (see Fig. 226). Then the rate of change
of the field in this direction (related to unit length) is called the
derivative of u along the direction l:

$$\frac{\partial u}{\partial l} = \lim_{N \to M} \frac{u(N) - u(M)}{\Delta s}$$

where N is a point of (L) approaching M and $\Delta s$ is the length of
the arc MN.

Fig. 226

To compute the directional derivative let us suppose that the curve
(L) is represented in parametric form by the equation r = r (s) where
the parameter s is the arc length reckoned along (L) (see Sec. VII.23).
Then the values of u taken along (L) form a composite function of the
arc length: u (s) = u (x (s), y (s), z (s)). The sought-for derivative
is then nothing but the derivative du/ds. Therefore, by the rule of
differentiating a composite function (IX.11), we have

$$\frac{\partial u}{\partial l} = \frac{\partial u}{\partial x}\frac{dx}{ds} + \frac{\partial u}{\partial y}\frac{dy}{ds} + \frac{\partial u}{\partial z}\frac{dz}{ds}$$

The right-hand side can be represented as a scalar product of two
vectors [see formula (VII.12)]:

$$\frac{\partial u}{\partial l} = \left(\frac{\partial u}{\partial x}\mathbf{i} + \frac{\partial u}{\partial y}\mathbf{j} + \frac{\partial u}{\partial z}\mathbf{k}\right) \cdot \left(\frac{dx}{ds}\mathbf{i} + \frac{dy}{ds}\mathbf{j} + \frac{dz}{ds}\mathbf{k}\right)$$

The first of the vectors is called the gradient of the field (function)
u. It is designated as

$$\operatorname{grad} u = \frac{\partial u}{\partial x}\mathbf{i} + \frac{\partial u}{\partial y}\mathbf{j} + \frac{\partial u}{\partial z}\mathbf{k} \tag{2}$$

The meaning of this vector will be discussed a little later. The
second vector

$$\frac{dx}{ds}\mathbf{i} + \frac{dy}{ds}\mathbf{j} + \frac{dz}{ds}\mathbf{k} = \frac{d(x\mathbf{i} + y\mathbf{j} + z\mathbf{k})}{ds} = \frac{d\mathbf{r}}{ds} = \boldsymbol{\tau}$$

is the unit vector in the direction l (see Sec. VII.23). Thus,

$$\frac{\partial u}{\partial l} = \operatorname{grad} u \cdot \boldsymbol{\tau} \tag{3}$$


The first factor entering into the right-hand side depends only
on the choice of the point M. The second factor depends only on
the choice of the direction l. In particular, we see that $\partial u / \partial l$ is
independent of the choice of a concrete curve (L) among all the possible
curves passing through M in the given direction l. (By the way,
it should be noted that the derivative $\partial^2 u / \partial l^2$ will no longer be
independent of the choice.)

According to formula (VII.5), we deduce from (3) the expression

$$\frac{\partial u}{\partial l} = \operatorname{proj}_l (\operatorname{grad} u) = \operatorname{grad}_l u \tag{4}$$

($\operatorname{grad}_l u$ designates the projection of the gradient on the axis
passing in the direction l).

Note that the derivatives $u'_x$, $u'_y$ and $u'_z$ are also directional
derivatives: for instance, $u'_x$ is the derivative in the direction of the
x-axis.
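
Formula (3) lends itself to a quick numerical check. The sketch below is illustrative only: the field u, the point M and the direction l are assumptions chosen for the example, and the curve (L) is taken to be a straight line.

```python
import math

def u(x, y, z):
    # Illustrative scalar field (an assumption, not from the text).
    return x * x * y + math.sin(z)

def grad_u(x, y, z):
    # Partial derivatives of u computed by hand, as in definition (2).
    return (2 * x * y, x * x, math.cos(z))

def unit(vector):
    """Unit vector tau in the direction of `vector`."""
    norm = math.sqrt(sum(c * c for c in vector))
    return [c / norm for c in vector]

def directional_derivative(point, direction):
    """Formula (3): the scalar product grad u . tau."""
    return sum(g * t for g, t in zip(grad_u(*point), unit(direction)))

def difference_quotient(point, direction, ds=1e-6):
    """(u(N) - u(M)) / ds for a point N at distance ds from M."""
    moved = [p + ds * t for p, t in zip(point, unit(direction))]
    return (u(*moved) - u(*point)) / ds

M = (1.0, 2.0, 0.5)
l = (1.0, 2.0, 2.0)
by_formula = directional_derivative(M, l)
by_quotient = difference_quotient(M, l)

# Along the gradient itself the derivative equals the modulus of the gradient.
along_gradient = directional_derivative(M, grad_u(*M))
gradient_modulus = math.sqrt(sum(g * g for g in grad_u(*M)))
```

The two values `by_formula` and `by_quotient` agree to within the accuracy of the difference quotient, and `along_gradient` coincides with `gradient_modulus`.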

Let us put down one more useful formula containing the gradient
which is based on definition (IX.7) of the total differential:

$$du = \frac{\partial u}{\partial x}\,dx + \frac{\partial u}{\partial y}\,dy + \frac{\partial u}{\partial z}\,dz
= \left(\frac{\partial u}{\partial x}\mathbf{i} + \frac{\partial u}{\partial y}\mathbf{j} + \frac{\partial u}{\partial z}\mathbf{k}\right) \cdot (dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k})
= \operatorname{grad} u \cdot d(x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) = \operatorname{grad} u \cdot d\mathbf{r}$$

Let a field u and a point M be given. Let us set the following
problem: in what direction l is the derivative $\partial u / \partial l$ maximal? We see that
on the basis of formula (4) the problem reduces to the following
question: in what direction is the projection of the vector grad u
maximal? Evidently, the maximal projection of any vector is obtained
when we take its own direction, the maximal projection
being equal to the modulus of the vector.

Thus, the vector grad u at a point M indicates the direction of
the maximal rate of increase of the field (function) u, this maximal
rate (related to unit length) being equal to | grad u |. The faster
the change of the field, the greater the modulus. See Fig. 227 where
the outer circle bounds a part of a heat conducting medium which
is being cooled from outside and which is being heated from the
internal region (shaded). The arrows represent the vector field of
gradients of the scalar temperature field in question. We see that
the gradient of the temperature is directed “toward the stove”.

The physical meaning of the gradient implies that the relationship
between a scalar field and its gradient is invariant, that is it
remains the same when an original Cartesian coordinate system is
replaced by another because the rate and the direction of maximal
increase of a field are independent of the choice of a coordinate
system. [By the way, the original definition (2) of the gradient which
is connected with a particular choice of a Cartesian coordinate system
does not directly imply the invariance.] Moreover, if we are given a
field u we can find the direction and the rate of maximal increase of
the field u at every point in space and hence we can find the vector
grad u without using coordinates and without representing the field
as a function u (x, y, z). Thus, vectors grad u form a completely
specified vector field of gradients corresponding to a given scalar
field.

Analogous conditions of invariance are set for all basic notions 
of the theory of vector field which we will not study in this chapter. 
The matter is that when we change an original Cartesian coordinate 
system the projections of vectors change although the vectors them- 
selves remain invariant. Therefore, if a concept related to the theory 
of vector field is formulated in terms of coordinates or projections 
of a field we must additionally verify whether the concept satisfies 
the condition of invariance with respect to the changes of coordi- 
nates and projections when the coordinate axes are rotated. 

Let us illustrate the application of the concept of gradient to
the problem of computing the rate of change of a scalar field along
a trajectory. Suppose we have a field u which may be non-stationary
in the general case, that is u = u (x, y, z, t). Besides, let
a certain law of motion of a particle M be given in the form r = r (t).
If we consider the value of u at M in the process of motion then the
value becomes a composite function of time: u = u (x (t), y (t),
z (t), t). To compute the sought-for rate of change of the value we
can apply transformations similar to the ones given above. This
leads to the so-called total derivative

$$\frac{du}{dt} = \operatorname{grad} u \cdot \mathbf{v} + \frac{\partial u}{\partial t} = \frac{\partial u}{\partial \tau}\,|\mathbf{v}| + \frac{\partial u}{\partial t}$$

where $\mathbf{v} = d\mathbf{r}/dt$ is the velocity vector of the particle in the process
of its motion and $\partial u / \partial \tau$ is the derivative of u in the direction of the
tangent to the trajectory.

Fig. 227

In case the field is stationary, that is if $\partial u / \partial t = 0$, we have only


the first summand on the right-hand side. Hence, this summand repre- 
sents the rate of change of the field which is due only to the transi- 
tion of the point M from one value of u to another along the trajec- 
tory. For instance, if u is temperature such a summand describes 
the changes of the temperature which are due to the transition of 
the point M from one region in space to another region with diffe- 
rent temperature and the like. This is the so-called convective ve- 
locity. The second summand represents the rate of change at a 
motionless point (coinciding with the current position of the moving 
point M at a certain moment of time) which is due to the non-sta- 
tionarity of the field. This is the local velocity. In the general case 
we have both factors which add together and yield the resultant 
rate of change of the field along the trajectory which is the sum of 
the convective velocity and the local velocity. 
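
The decomposition of the total derivative into its convective and local parts can be illustrated numerically. This is a minimal sketch under stated assumptions: the field u and the helical law of motion are invented for the example, and the rate of change along the trajectory is estimated by a central difference.

```python
import math

def u(x, y, z, t):
    # Illustrative non-stationary field (assumed for this sketch).
    return (x * y + z) * math.exp(-t)

def position(t):
    # Illustrative law of motion r = r(t): a helix.
    return (math.cos(t), math.sin(t), t)

def velocity(t):
    # v = dr/dt for the helix above.
    return (-math.sin(t), math.cos(t), 1.0)

def convective_and_local(t):
    """Return the two summands grad u . v and du/dt (partial) at the particle."""
    x, y, z = position(t)
    decay = math.exp(-t)
    grad = (y * decay, x * decay, decay)
    convective = sum(g * v for g, v in zip(grad, velocity(t)))
    local = -(x * y + z) * decay
    return convective, local

def u_on_trajectory(t):
    x, y, z = position(t)
    return u(x, y, z, t)

def total_derivative_numeric(t, dt=1e-6):
    """Central difference of u taken along the trajectory."""
    return (u_on_trajectory(t + dt) - u_on_trajectory(t - dt)) / (2 * dt)

t0 = 0.8
convective, local = convective_and_local(t0)
numeric = total_derivative_numeric(t0)
```

The sum `convective + local` reproduces the numerically estimated rate of change `numeric`, which is the statement of the formula for the total derivative.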

2. Level Surfaces. Level surfaces of a field u (x, y, z) (see Sec. IX.7)
are the surfaces on which the field assumes constant values,
that is the surfaces represented by equations of the form u (x, y, z)
= const. Depending on the physical meaning of the field in question
these surfaces may be called isothermic surfaces, isobaric surfaces
and the like. There is a simple relationship between these
surfaces and the gradient of the field: at each point M the gradient
is normal (i.e. perpendicular to the tangent plane) to the level
surface passing through the point M.

Actually, as it is seen from Fig. 228, the surfaces u = C and
$u = C + \Delta C$ can be regarded as being almost plane near the point
M if $\Delta C$ is sufficiently small, and besides
$\dfrac{\partial u}{\partial l} \approx \dfrac{\Delta u}{\Delta s} = \dfrac{\Delta C}{\Delta s}$. But it
is clear that if l is directed along the normal to the surface the
quantity $\Delta s$ will assume its least value, and $\partial u / \partial l$ will therefore assume
its maximal value. This implies our assertion.

In particular, we see that the assertion enables us to solve the
following problem: to find the equation of the tangent plane passing
through a point $M_0 (x_0, y_0, z_0)$ of a surface (L) having an equation
of the form F (x, y, z) = 0. To solve the problem let us introduce
a scalar field in space by means of the equation u = F (x, y, z).
Then (L) becomes one of the level surfaces of the field because we
have u = F (x, y, z) = 0 on the surface. Then the vector

$$(\operatorname{grad} u)_{M_0} = \left(\frac{\partial F}{\partial x}\right)_0 \mathbf{i} + \left(\frac{\partial F}{\partial y}\right)_0 \mathbf{j} + \left(\frac{\partial F}{\partial z}\right)_0 \mathbf{k}$$

(the subscript “zero” indicates that the corresponding derivatives
are taken at the point $M_0$) is perpendicular to the sought-for tangent
plane. Hence, according to Sec. X.7 (see problem 2), we obtain
the equation of the plane:

$$\left(\frac{\partial F}{\partial x}\right)_0 (x - x_0) + \left(\frac{\partial F}{\partial y}\right)_0 (y - y_0) + \left(\frac{\partial F}{\partial z}\right)_0 (z - z_0) = 0 \tag{5}$$

The last equation can be put down as dF = 0. Let the reader think
how we could deduce this equation in a direct way.

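Fig. 229. $\tan \alpha = \left(\dfrac{\partial f}{\partial x}\right)_0$, $\tan \beta = \left(\dfrac{\partial f}{\partial y}\right)_0$

A surface for which the tangent plane is to be constructed can be
represented by an equation of the form z = f (x, y). Here we can
rewrite the equation as z − f (x, y) = 0 and denote its left-hand
side by F (x, y, z). Then formula (5) is directly applicable, and
thus we have

$$-\left(\frac{\partial f}{\partial x}\right)_0 (x - x_0) - \left(\frac{\partial f}{\partial y}\right)_0 (y - y_0) + (z - z_0) = 0,$$

i.e.

$$z - z_0 = \left(\frac{\partial f}{\partial x}\right)_0 (x - x_0) + \left(\frac{\partial f}{\partial y}\right)_0 (y - y_0) \tag{6}$$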

The right-hand side being equal to the total differential df, we 
thus obtain the geometric meaning of the total differential of a 
function of two independent variables. Namely, the differential 
is equal to the increment of the third coordinate of the point in 
the tangent plane (see Fig. 229). 
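
The construction of the tangent plane by formula (5) can be sketched in a few lines. The surface below (an ellipsoid) and the point M0 are assumptions made for illustration; the code evaluates the left-hand side of (5) at a point of the surface close to M0.

```python
def F(x, y, z):
    # Illustrative surface F = 0 (an ellipsoid); not taken from the text.
    return x * x + 2 * y * y + 3 * z * z - 6

def grad_F(x, y, z):
    # Partial derivatives of F computed by hand.
    return (2 * x, 4 * y, 6 * z)

M0 = (1.0, 1.0, 1.0)  # F(M0) = 0, so M0 lies on the surface

def tangent_plane(point):
    """Left-hand side of equation (5) evaluated at `point`."""
    return sum(g * (p - p0) for g, p, p0 in zip(grad_F(*M0), point, M0))

def surface_z(x, y):
    """Upper branch z = z(x, y) of the surface F = 0."""
    return ((6 - x * x - 2 * y * y) / 3) ** 0.5

# A point of the surface at a distance of about 0.02 from M0.
near = (1.01, 0.98, surface_z(1.01, 0.98))
residual = tangent_plane(near)
```

The value `residual` is not zero, since `near` lies on the surface and not in the plane, but it is small of the second order relative to the displacement from M0, as one expects for a tangent plane.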

Take an example. Let us compute the gradient of a centrally
symmetric field u = f (r) where $r = |\mathbf{r}| = \sqrt{x^2 + y^2 + z^2}$. In
this case the level surfaces are concentric spheres with centre at the
origin of coordinates (why is it so?). If we take two spheres for
which the difference of their radii is equal to dr then the difference
of the corresponding values of the function f which are taken on
these surfaces will be equal to df. Therefore, the rate of change of
the function in a direction which is transversal to the level surfaces
(that is along a radius) is equal to $\dfrac{df}{dr}$. Hence,

$$\operatorname{grad} u(r) = \frac{df}{dr}\,\mathbf{r}^0 = \frac{df}{dr}\,\frac{\mathbf{r}}{r} \tag{7}$$

where $\mathbf{r}^0 = \mathbf{r} / r$ is the unit vector in the direction of the vector r.
[Let the reader obtain result (7) on the basis of definition (2).]
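
Result (7) can also be checked directly from definition (2). The following lines are a short sketch of that computation; they use only the rule of differentiating a composite function and the expression for r given above.

```latex
\frac{\partial u}{\partial x}
  = \frac{df}{dr}\,\frac{\partial r}{\partial x}
  = \frac{df}{dr}\,\frac{x}{\sqrt{x^2 + y^2 + z^2}}
  = \frac{df}{dr}\,\frac{x}{r},
\qquad
\frac{\partial u}{\partial y} = \frac{df}{dr}\,\frac{y}{r},
\qquad
\frac{\partial u}{\partial z} = \frac{df}{dr}\,\frac{z}{r},

\operatorname{grad} u
  = \frac{df}{dr}\,\frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{r}
  = \frac{df}{dr}\,\frac{\mathbf{r}}{r}
```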

3. Implicit Functions of Two Independent Variables. Implicit
functions of two arguments were discussed in Sec. IX.13. Now we
can approach them from a new point of view. Let us consider the
equation

$$F(x, y, z) = 0 \tag{8}$$

in the vicinity of the point $M_0 (x_0, y_0, z_0)$ at which the equation
is satisfied. The equation defines a surface (L) in space passing
through the point $M_0$. If $\left(\dfrac{\partial F}{\partial z}\right)_0 \neq 0$ [see condition (IX.16)] then,
by formula (2), the vector $(\operatorname{grad} F)_{M_0}$ has a nonzero component in
the direction of the z-axis. This implies that the tangent plane to
(L) passing through $M_0$ [which is perpendicular to $(\operatorname{grad} F)_{M_0}$]
is not parallel to the z-axis. Therefore, near the point $M_0$ at which
the surface (L) touches the plane the angle between the z-axis and
the normal to the surface (L) is different from the right angle. Hence,
in the vicinity of $M_0$ equation (8) defines a relationship of the form
z = z (x, y). This functional relationship is local (i.e. it is defined
only near $M_0$ or, as we say, “in the small”) because if we take a point
which is not sufficiently close to $M_0$ we can encounter the case when
there are several values of z corresponding to given values of x and y
or when there are no such values at all (see Fig. 230). It should be
noted that the condition for the existence of a system of implicit
functions established in Sec. IX.13 is also of a local character because
it guarantees their existence only in the vicinity of the point in
question.


If $\left(\dfrac{\partial F}{\partial z}\right)_0 = 0$ then the tangent plane to (L) is parallel to the
z-axis at the point under consideration (as it is at the point $N_0$ in
Fig. 230). In such a case it can happen that even for some points
(x, y) lying very close to $N_0$ equation (8) does not define a one-valued
function z = z (x, y). For instance, we see that some values of x
and y taken near the point $N_0$ yield two possible values of z but
at the same time there are no such values at all for other values of
x and y because the surface is convex in the direction of the radius
at the point $N_0$. But if, for example, we have $\dfrac{\partial F}{\partial y} \neq 0$ at $N_0$ then
near $N_0$ equation (8) defines a function y = y (x, z). Equation (8)
can happen not to define any coordinate as a function of the other
coordinates near a point belonging to a surface (L) represented by
an equation F (x, y, z) = 0 only if we simultaneously have

$$\frac{\partial F}{\partial x} = 0, \qquad \frac{\partial F}{\partial y} = 0, \qquad \frac{\partial F}{\partial z} = 0 \tag{9}$$

at this point.


Fig. 230    Fig. 231

Points at which conditions (9) hold are called singular points of
the surface (L). A “typical” point belonging to a surface defined by
equation (8) is not singular because such points are found by solving
the system of four equations (8) and (9) in three unknowns
x, y and z which is inconsistent (overdetermined) in the general
case. Therefore most of the surfaces have no singular points. Among
the well-known surfaces only conic surfaces possess singular points
which are their vertices.
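
As an illustration (the two surfaces are chosen here only for the sake of the example), conditions (9) can be written out for a circular cone and for a sphere:

```latex
\text{Cone: } F = x^2 + y^2 - z^2, \qquad
F'_x = 2x,\quad F'_y = 2y,\quad F'_z = -2z
\;\Longrightarrow\; \text{(9) holds only at } (0, 0, 0), \text{ where } F = 0 ;

\text{Sphere: } F = x^2 + y^2 + z^2 - 1, \qquad
F'_x = 2x,\quad F'_y = 2y,\quad F'_z = 2z
\;\Longrightarrow\; \text{(9) holds only at } (0, 0, 0), \text{ where } F = -1 \neq 0 .
```

Thus the vertex of the cone satisfies all four equations (8) and (9) and is a singular point, whereas for the sphere the only solution of (9) does not lie on the surface, so the sphere has no singular points.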

4. Plane Fields. All the notions established for space fields are
transferred with corresponding simplifications to plane fields (see
the end of Sec. IX.9). For instance, the gradient
$\operatorname{grad} u = \dfrac{\partial u}{\partial x}\mathbf{i} + \dfrac{\partial u}{\partial y}\mathbf{j}$
of the field u (x, y) is a vector lying in the x, y-plane. The
gradient of a plane field is normal to the level line, that is to the
curve represented by an equation of the form u (x, y) = const,
at each point (x, y) (see Fig. 231). In this case the
meaning of the gradient implies that its modulus is approximately
inversely proportional to the distance between
the level lines, that is the level lines are closer to each other in those
regions where the gradient is longer.

Naturally, the equation of the tangent line to a curve represented
by an equation of the form

$$f(x, y) = 0 \tag{10}$$

is obtained from (5) by dropping the third summand.
Equation (10) locally defines a function y = y (x) if $f'_y \neq 0$. Singular
points of curve (10) are the points for which

$$f'_x = 0 \quad \text{and} \quad f'_y = 0 \tag{11}$$

Let us introduce a surface (L) having the equation z = f (x, y). Then
curve (10) can be interpreted as the line of intersection of (L) with
the plane z = 0. If conditions (10) and (11) are fulfilled at a point
formula (6) implies that the plane is tangent to the surface (L)
at the point. In Sec. 9 we shall investigate the form of the line of
intersection of a surface with its tangent plane near the point of
tangency. We shall see that usually singular points of a plane curve
are isolated points, nodal points (double points) or, more seldom,
cusps (see Fig. 232).

Fig. 232. (a) Isolated singular point; (b) Nodal point; (c) Cusp

Fig. 233. Envelope and curves of the family

5. Envelope of One-Parameter Family of Curves. Let us consider
a family of curves dependent on a single parameter C (a one-parameter
family). The general form of an equation of such curves can
be put down as

$$F(x, y, C) = 0 \tag{12}$$


Making C assume a certain concrete value we isolate an individual 
curve from the family. It often happens that the disposition of the 
curves resembles Fig. 233. In such a case we say that the family 
possesses an envelope, that is a curve (which usually does not enter 
into the family) which touches some curve of the family at each 
of its points. To find the equation of an envelope we note that each
of its points belongs to a curve of the family and equation (12) is
therefore satisfied at each point. But, at the same time, in
moving along the envelope the value of the quantity C [determining
the curve of the family which the envelope touches at a point (x, y)]
varies: C = C (x). Differentiating equality (12) with respect to x
(after the ordinate of the envelope which is a function of x has been
substituted for y) we obtain

$$F'_x + F'_y\, y'_{\text{envelope}} + F'_C\, C'_x = 0 \tag{13}$$

where $y'_{\text{envelope}}$ is the slope of the envelope (i.e. of its tangent line)
at an arbitrary point M. But, the envelope touching a curve of the
family at the point M, the slope of the envelope equals the slope
of the curve, that is $y'_{\text{curve}} = y'_{\text{envelope}}$. The quantity $y'_{\text{curve}}$ is found
from equation (12) by differentiating with respect to x for a fixed C:

$$F'_x + F'_y\, y'_{\text{curve}} = 0 \tag{14}$$

Hence, (13) and (14) imply that $F'_C\, C'_x = 0$. But, as it has been
indicated, C (x) is a variable quantity and therefore, in general,
$C'_x (x) \neq 0$. Consequently, we have

$$F'_C (x, y, C) = 0 \tag{15}$$

Thus, for the points of the envelope, equations (12) and (15) hold
simultaneously. Eliminating C from these two equations we arrive
at the equation of the envelope.
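
A short worked example may help to see the elimination at work. The family used here is an assumption chosen for illustration: circles of a fixed radius R whose centres (C, 0) run along the x-axis.

```latex
F(x, y, C) = (x - C)^2 + y^2 - R^2 = 0, \qquad
F'_C(x, y, C) = -2\,(x - C) = 0 \;\Longrightarrow\; C = x,

\text{and substituting } C = x \text{ into the equation of the family gives }
y^2 = R^2, \text{ i.e. } y = \pm R .
```

The envelope therefore consists of the two straight lines y = R and y = −R, each of which touches every circle of the family.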

Example. Let us consider the family of the trajectories of
motion of a shell under assumptions enumerated in example 4 of
Sec. II.6, when the initial velocity $v_0$ is given, for different values
of the angle of inclination $\alpha$. Here $\alpha$ serves as a parameter of the
family, and therefore, in order to find the envelope (see Fig. 234),
we differentiate the equation of the family [equation (II.11)] with
respect to $\alpha$:

$$0 = \frac{x}{\cos^2 \alpha} - \frac{g x^2 \sin \alpha}{v_0^2 \cos^3 \alpha}$$

Expressing $\tan \alpha$ from the last equation and substituting this value
into the equation of the family we obtain the equation of the envelope:

$$y = \frac{v_0^2}{2g} - \frac{g}{2 v_0^2}\, x^2$$

Hence, the envelope is a parabola, the so-called safety parabola (why 
is it called so?). 
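
The envelope can be checked numerically: for every horizontal distance x the greatest height attainable over all angles of inclination should lie on the safety parabola. The sketch below assumes the usual equation of a trajectory without air resistance, y = x tan α − g x² / (2 v₀² cos² α); the numerical values of g and v₀ and the grid of angles are illustrative choices.

```python
import math

G = 9.81    # free-fall acceleration, m/s^2
V0 = 20.0   # initial speed, m/s (illustrative value)

def trajectory(x, alpha):
    """Height of the shell at horizontal distance x for launch angle alpha."""
    return x * math.tan(alpha) - G * x * x / (2 * V0 * V0 * math.cos(alpha) ** 2)

def safety_parabola(x):
    """Envelope of the family: y = v0^2 / (2 g) - g x^2 / (2 v0^2)."""
    return V0 * V0 / (2 * G) - G * x * x / (2 * V0 * V0)

def highest_reachable(x, samples=20000):
    """Largest height over a fine grid of launch angles in (0, pi/2)."""
    angles = (math.pi / 2 * (i + 0.5) / samples for i in range(samples))
    return max(trajectory(x, a) for a in angles)

checks = [(x, highest_reachable(x), safety_parabola(x)) for x in (5.0, 15.0, 30.0)]
```

In each triple of `checks` the second and third entries agree closely, and no trajectory rises above the parabola, which is one way of answering the question posed in the text.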

Let us take one more example. As we know, all the normals to an 
evolvent touch the evolute (see Sec. VII.26) and thus the evolute 
is the envelope of the family of all the normals to the evolvent. This 
property implies an approximate method of constructing an evolute: 
we draw several normals to the evolvent and then trace their envelope. 

We must take into account that if the curves belonging to a family
in question possess singular points (see Sec. 4) then, when eliminating
C from (12) and (15), we obtain, besides the envelope, the curve
which is the locus of singular points (see Fig. 235). Indeed, as it
was shown in Sec. 4, we have $F'_x = F'_y = 0$ for such points and
therefore in this case (13) implies (15) even if the slopes of the locus
of singular points and of the curves of the family do not coincide,
that is if $y'_{\text{locus}} \neq y'_{\text{curve}}$.

Fig. 234. Safety parabola

Fig. 235. (a) Envelope; (b) Locus of nodal points


§ 2. Extremum of a Function of Several Variables 


6. Taylor's Formula for a Function of Several Variables. For
definiteness, let us consider a function f (x, y) of two variables.
(Similar results are valid for an arbitrary number of independent
variables.) It turns out that formula (IV.62) remains true for such
a function f without any changes.

To prove the assertion let us take an arbitrary direction and draw
a ray passing through a point (a, b) in the x, y-plane in this direction
(which is indicated by the arrow in Fig. 236). The values of the
function f which are taken on this ray depend only on the single
argument $\rho$, i.e. $f (x, y) = f^* (\rho)$. We can thus apply formula (IV.62)
to the function $f^*$.

Fig. 236

We have $\Delta f^* = \Delta f$ but at the same time, in investigating the
relationship between the differentials of $f^*$ and f, we must take into
account the following fact. According to Fig. 236,

$$x = a + \rho \cos \varphi, \qquad y = b + \rho \sin \varphi \qquad (a, b, \varphi = \text{const}) \tag{16}$$

and therefore

$$f^* (\rho) = f (x, y) = f (a + \rho \cos \varphi,\; b + \rho \sin \varphi)$$


Hence, in computing $df, d^2 f, \dots$ we consider the variables x
and y to be independent whereas in computing $df^*, d^2 f^*, \dots$ we
consider them to be dependent on $\rho$. As we know (see Secs. IX.12
and IX.16), this does not matter for the first-order differentials,
that is we have $df^* = df$, but, generally, this is essential for
higher-order differentials. But in our case formula (16) implies that

$$d^2 x = d^3 x = \dots = 0 \quad \text{and} \quad d^2 y = d^3 y = \dots = 0$$

Therefore in this particular case formulas (IX.23) and (IX.24)
show that $d^2 f^* = d^2 f$ and, similarly, $d^3 f^* = d^3 f$ etc.

Thus, formula (IV.62) for the function $f^*$ automatically implies
the validity of the same formula for the function f (x, y).

In practical applications we usually truncate the formula thus 
retaining only one or two terms. Then we get (see Sec. IX.16) the 
formulas 


$$f (a + h, b + k) = f (a, b) + f'_x (a, b)\, h + f'_y (a, b)\, k + \text{the terms of the order of smallness not less than the second (relative to } h \text{ and } k\text{)} \tag{17}$$

and

$$f (a + h, b + k) = f (a, b) + f'_x (a, b)\, h + f'_y (a, b)\, k + \frac{1}{2} \left[ f''_{xx} (a, b)\, h^2 + 2 f''_{xy} (a, b)\, hk + f''_{yy} (a, b)\, k^2 \right] + \text{the terms of the order of smallness not less than the third} \tag{18}$$


As in the case of a function of one argument, formulas (17) and 
(18) can be applied if |h | and |k | are sufficiently small because 
otherwise the formulas can lead to incorrect results. In all cases 
when we apply Taylor’s formula we suppose that the corresponding 
derivatives exist and are finite. 
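
The orders of smallness of the discarded terms in (17) and (18) can be observed numerically. In the following sketch the function f and the point (a, b) are illustrative assumptions; the partial derivatives are written out by hand for this particular f.

```python
import math

def f(x, y):
    # Illustrative function (an assumption for this sketch).
    return math.exp(x) * math.sin(y)

def taylor1(a, b, h, k):
    """Formula (17) without the discarded terms."""
    fx = math.exp(a) * math.sin(b)
    fy = math.exp(a) * math.cos(b)
    return f(a, b) + fx * h + fy * k

def taylor2(a, b, h, k):
    """Formula (18) without the discarded terms."""
    fxx = math.exp(a) * math.sin(b)
    fxy = math.exp(a) * math.cos(b)
    fyy = -math.exp(a) * math.sin(b)
    return taylor1(a, b, h, k) + 0.5 * (fxx * h * h + 2 * fxy * h * k + fyy * k * k)

a, b = 0.3, 1.1
errors = []
for step in (0.1, 0.05, 0.025):
    exact = f(a + step, b + step)
    errors.append((step,
                   abs(exact - taylor1(a, b, step, step)),
                   abs(exact - taylor2(a, b, step, step))))
```

Each halving of the step reduces the error of (17) roughly fourfold and the error of (18) roughly eightfold, in agreement with remainders of the second and third order respectively.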

7. Extremum. As in Sec. 6, we shall take, for the sake of simplicity,
the case of a function z = f (x, y) of two arguments. The definition
of an extremum is similar to the one introduced for functions
of one independent variable (see Sec. IV.18). For instance, we say
that a function z = f (x, y) has a maximum at “a point” (that is
for certain values $x = x_0$ and $y = y_0$) if the value $f (x_0, y_0)$ is greater
than all the “neighbouring” values of the function f, i.e. than the
values f (x, y) taken for x and y which are, respectively, sufficiently
close to $x_0$ and $y_0$.

In this section we shall consider only extrema that are attained 
in the interior of the domain of definition of a function f and, besides, 
we shall suppose that the function f itself and its partial derivatives 
have no discontinuities. Fig. 237 approximately represents the dis- 
position of the family of level lines of a function f of this type (see 
Sec. IX.1) near its point of extremum. 

We can easily establish the necessary condition for an extremum.
Indeed, if we fix $y = y_0$ and make x vary then the corresponding
point (x, y) in Fig. 237 will move along the straight line ll, and
the function $f = f (x, y_0)$ (regarded as a function of x) will have
an extremum at $x = x_0$. The function $f = f (x, y_0)$ depending only
on x, we have $f'_x (x_0, y_0) = 0$, according to Sec. IV.18. This is
a partial derivative since it is taken for a fixed y. We similarly
consider the case when the point (x, y) moves along the straight
line mm and thus deduce the following necessary conditions for
an extremum:

$$f'_x (x_0, y_0) = 0, \qquad f'_y (x_0, y_0) = 0 \tag{19}$$

(in the case of a function of a greater number of independent variables
we must similarly equate to zero all the partial derivatives of the
first order). A point (lying in the x, y-plane) at which conditions (19)
hold is called a critical (stationary) point of the function f.
Consequently, if the conditions imposed in the preceding paragraph
hold all the points of extremum of the function f are its critical
points.

Conversely, let a critical point $(x_0, y_0)$ of the function f (x, y)
be found. Is it correct to assert that this point must be a point of
extremum? If there is only one critical point in a certain region and
if we are sure that some physical or other conditions guarantee
the existence of the extremum then our answer is affirmative. In
other cases we must apply some sufficient conditions which we are
going to study now.

Fig. 237

As we know from Sec. IV.18, the necessary condition for an extremum
of a function f (x) of one independent variable which is
expressed by the equality $f' (x_0) = 0$ is at the same time “almost
sufficient” because if in addition we have $f'' (x_0) \neq 0$ the extremum
at the point $x = x_0$ is sure to exist. But it turns out that for the case
of a function of several variables this is not so. If conditions (19)
hold for f (x, y) and the partial derivatives of the second order of
the function of two variables are different from zero at the point
$(x_0, y_0)$ the extremum nevertheless may not exist. Thus, in the case
of functions of many variables we can have a situation of a new type.

For example, see the “graph” of the function $z = f (x, y) = x^2 + y^2$
depicted in Fig. 220. Conditions (19) yield a single
stationary point in this case which is the point (0, 0). Evidently,
there is a minimum at this point because f (0, 0) = 0 and z > 0
at the other points. At the same time, for the function $z = -x^2 + y^2$
we have an essentially different case. Its “graph” is shown in Fig. 221.
Again, the only critical point is the origin of coordinates. We have
$z = y^2$ for x = 0, that is the function increases when we move from
the origin along the y-axis in both directions and, as a function of
one variable y, it has a minimum at the origin. But if y = 0 then
$z = -x^2$ and hence along the x-axis the function decreases in both
directions and has a maximum at the origin. Taking other straight
lines passing through the origin we see that the function also has
a maximum at the origin for one group of these straight lines and
has a minimum for the other group (by the way the function assumes
the constant value z = 0 on two straight lines which are
$y = \pm x$). In such a case we say that the function has a minimax at
the point in question. Hence, the function $z = -x^2 + y^2$ has neither
a maximum nor a minimum at the origin although necessary conditions
(19) are fulfilled in this case and the partial derivatives
$\dfrac{\partial^2 z}{\partial x^2} = -2$ and $\dfrac{\partial^2 z}{\partial y^2} = 2$ are different from zero.

After discussing these examples let us proceed to investigate
the functions of general form. Suppose conditions (19) are fulfilled
at a point $(x_0, y_0)$ for a function f (x, y). Let us see whether or not
the function f has an extremum at the point. In order to do this we
take Taylor’s formula (18) and put $a = x_0$, $b = y_0$ in it. This results
in

$$\Delta f = f (x_0 + h, y_0 + k) - f (x_0, y_0) = \frac{1}{2} \left[ f''_{xx} (x_0, y_0)\, h^2 + 2 f''_{xy} (x_0, y_0)\, hk + f''_{yy} (x_0, y_0)\, k^2 \right] + \text{the terms of the order of smallness not less than the third}$$


The terms of the first order of smallness relative to h and k do not
enter in the result since the stationarity conditions (19) imply that
these terms are equal to zero. The terms of the third order being
considerably smaller than the terms of the second order for sufficiently
small |h| and |k|, the sign of the right-hand side is determined
by the group of terms of the second order. Hence the sign coincides
with the sign of the quadratic form

$$P (h, k) = f''_{xx} (x_0, y_0)\, h^2 + 2 f''_{xy} (x_0, y_0)\, hk + f''_{yy} (x_0, y_0)\, k^2 \tag{20}$$

(we do not put down the factor 1/2 here since it does not affect the
sign we are interested in). Consequently, if the sum (20) is positive
for all h and k (of course, except for h = k = 0 when it vanishes)
then we have $\Delta f > 0$ for sufficiently small |h|, |k| which implies
that $f (x_0 + h, y_0 + k) > f (x_0, y_0)$. Hence, we have a minimum
at the point $(x_0, y_0)$ in this case. If the sum is negative then we
likewise conclude that there will be a maximum at the point $(x_0, y_0)$.
Finally, if the sum can assume the values of both signs there will
be a minimax at the point $(x_0, y_0)$ and thus there will be no extremum.
We cannot judge by the sum of the terms of the second order
whether there is an extremum only when the sum can vanish for
certain values of h and k which are not equal to zero but
does not change its sign (in particular, this is the case when the
sum is identically equal to zero and does not enter into the expression
of $\Delta f$). In such a case we must take into account the subsequent
terms of Taylor’s formula and take the sum of the terms of the third
order. Then a similar (but, of course, more complicated) investigation
of the sum for the values of h and k which turn the sum of the terms
of the second order into zero can indicate whether we have an extremum
and so on. We are not going to carry out this investigation
here.

These conclusions can be similarly drawn for functions of any
number of independent variables. But in the case of a function of
two variables we can easily proceed to express the sufficient conditions
for the extremum directly in terms of the values of the second
derivatives at the point $(x_0, y_0)$. To do this let us take $h^2$ outside
the brackets on the right-hand side of (20) and denote $\dfrac{k}{h} = t$. Then
we obtain

$$P (h, k) = \left[ (f''_{xx})_0 + 2 (f''_{xy})_0\, t + (f''_{yy})_0\, t^2 \right] h^2 \tag{21}$$
[the subscript “zero” indicates that the values of the derivatives are
taken at the stationary point $(x_0, y_0)$]. As is well known from
elementary algebra, the polynomial in t inside the square brackets has
two distinct real roots if its discriminant is positive, i.e. if

$$(f''_{xy})_0^2 - (f''_{xx})_0 (f''_{yy})_0 > 0 \tag{22}$$

In this case the polynomial changes its sign when passing through
the roots, and we therefore have a minimax here. But if

$$(f''_{xy})_0^2 - (f''_{xx})_0 (f''_{yy})_0 < 0$$

then the polynomial has imaginary roots and consequently it does
not change its sign (why is it so?). Therefore we have an extremum
in this case. To find out what is the sign of the right-hand side of
(21) we put t = 0. Then we see that if

$$(f''_{xy})_0^2 - (f''_{xx})_0 (f''_{yy})_0 < 0, \qquad (f''_{xx})_0 > 0 \tag{23}$$

then the right-hand side of (21) is positive for all t and thus, by
the results of the preceding paragraph, the function f has a minimum
at the point $(x_0, y_0)$. Similarly, if

$$(f''_{xy})_0^2 - (f''_{xx})_0 (f''_{yy})_0 < 0, \qquad (f''_{xx})_0 < 0 \tag{24}$$

then the function f has a maximum. Finally, if

$$(f''_{xy})_0^2 - (f''_{xx})_0 (f''_{yy})_0 = 0 \tag{25}$$

then the polynomial entering into (21) has a double root and thus
it does not change its sign but it can vanish for some nonzero values
of h and k. Thus, this is an ambiguous case.
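
Conditions (22)-(25) translate directly into a small classification routine. The function name `classify` and the sample inputs are hypothetical and serve only as an illustration; the arguments are the values of the second derivatives at a stationary point that has already been found from (19).

```python
def classify(fxx, fxy, fyy):
    """Classify a stationary point from the second derivatives taken there.

    Follows conditions (22)-(25): first the sign of fxy**2 - fxx*fyy,
    then, if that quantity is negative, the sign of fxx.
    """
    discriminant = fxy * fxy - fxx * fyy
    if discriminant > 0:
        return "minimax"
    if discriminant < 0:
        # Here fxx cannot vanish, since fxx * fyy > fxy**2 >= 0.
        return "minimum" if fxx > 0 else "maximum"
    return "ambiguous"

print(classify(2, 0, 2))    # z = x^2 + y^2 at the origin
print(classify(-2, 0, 2))   # z = -x^2 + y^2 at the origin
print(classify(2, 2, 2))    # z = (x + y)^2 at the origin
```

The three calls print `minimum`, `minimax` and `ambiguous`. With floating-point derivatives the exact test for a zero discriminant is fragile, so in practice a small tolerance would be preferable; that refinement is left out of the sketch.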

The condition guaranteeing that quadratic form (20) should be
positive can also be deduced from the general theory of quadratic
forms (see Sec. XI.14). According to the theory, after a certain
rotation of the coordinate axes, form (20) is transformed to
a “diagonal form”, that is to the form

$$P = \lambda_1 h'^2 + \lambda_2 k'^2 \tag{26}$$

where $\lambda_1$ and $\lambda_2$ are the roots of the characteristic equation

$$\begin{vmatrix} (f''_{xx})_0 - \lambda & (f''_{xy})_0 \\ (f''_{xy})_0 & (f''_{yy})_0 - \lambda \end{vmatrix} = 0 \tag{27}$$

and h′ and k′ are the increments of the new coordinates (which occur
after the rotation has been performed). Equation (27) implies that

$$\lambda_1 \lambda_2 = (f''_{xx})_0 (f''_{yy})_0 - (f''_{xy})_0^2, \qquad \lambda_1 + \lambda_2 = (f''_{xx})_0 + (f''_{yy})_0$$

(check it up!). We can easily deduce from these equalities that in
cases (22)-(25) we respectively have $\lambda_1 \lambda_2 < 0$ or $\lambda_1 > 0$, $\lambda_2 > 0$
or $\lambda_1 < 0$, $\lambda_2 < 0$ or $\lambda_1 \lambda_2 = 0$ (we leave it to the reader to verify
these assertions). From this, on the basis of equality (26), we deduce
the same conditions as those in the preceding paragraph.
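
The two equalities for the roots of (27) are easy to verify numerically. The sketch below solves the characteristic equation of a symmetric 2 × 2 matrix in closed form; the function name and the sample values of the second derivatives are assumptions made for the example.

```python
import math

def characteristic_roots(fxx, fxy, fyy):
    """Roots of equation (27) for the symmetric matrix [[fxx, fxy], [fxy, fyy]]."""
    half_trace = (fxx + fyy) / 2
    # The expression under the root is a sum of squares, so both roots are real.
    radius = math.sqrt(((fxx - fyy) / 2) ** 2 + fxy ** 2)
    return half_trace - radius, half_trace + radius

# Illustrative second derivatives at a stationary point.
fxx, fxy, fyy = 3.0, 1.0, 2.0
lam1, lam2 = characteristic_roots(fxx, fxy, fyy)

product_matches = abs(lam1 * lam2 - (fxx * fyy - fxy ** 2)) < 1e-12
sum_matches = abs(lam1 + lam2 - (fxx + fyy)) < 1e-12
```

For these sample values both roots are positive, which corresponds to case (23), that is to a minimum.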

In order to investigate the behaviour of a function $f(x_1, x_2, \dots, x_n)$
depending on an arbitrary number n of arguments at a
stationary point we must take the quadratic form

    $\sum_{i,j=1}^{n} (f''_{x_i x_j})_0 h_i h_j$    (28)

instead of (20) and the equation

    $\det (A - \lambda I) = 0$    (29)

[where $A = ((f''_{x_i x_j})_0)_{nn}$] in place of (27). If all the roots of equation
(29) are positive then form (28) assumes only positive values at all
points $(h_1, h_2, \dots, h_n)$ different from the point $h_1 = h_2 = \dots = h_n = 0$.
A quadratic form of this type is said to be positive definite.
In this case the function f has a minimum at the stationary point
in question. If all the roots of equation (29) are negative then form
(28) is negative definite, and the function f has a maximum. But
if equation (29) possesses roots of both signs then the function f
has a minimax.*
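
The test based on the roots of equation (29) can be tried out numerically. The following is a minimal sketch, assuming NumPy is available; the function name `classify_stationary_point` and the tolerance `tol` are illustrative choices and are not part of the text. Because the matrix A is symmetric, its roots are real, so a symmetric eigenvalue routine is used.

```python
import numpy as np

def classify_stationary_point(hessian, tol=1e-12):
    """Classify a stationary point from the matrix A of second derivatives.

    `hessian` holds the values (f''_{x_i x_j})_0; `tol` is an assumed
    threshold below which a root is treated as zero.
    """
    A = np.asarray(hessian, dtype=float)
    # Roots of det(A - lambda*I) = 0; real because A is symmetric.
    roots = np.linalg.eigvalsh(A)
    if np.all(roots > tol):
        return "minimum"      # form (28) is positive definite
    if np.all(roots < -tol):
        return "maximum"      # form (28) is negative definite
    if np.any(roots > tol) and np.any(roots < -tol):
        return "minimax"      # roots of both signs
    return "ambiguous"        # a zero root; second derivatives do not decide

# f = x^2 + 3y^2 at (0, 0): f_xx = 2, f_xy = 0, f_yy = 6
print(classify_stationary_point([[2, 0], [0, 6]]))
# f = x^2 - y^2 at (0, 0)
print(classify_stationary_point([[2, 0], [0, -2]]))
```

The "ambiguous" branch corresponds to case (25) for two variables: the quadratic form alone does not settle the question.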


* It can be shown that an equation of form (29) with the matrix
$A = ((f''_{x_i x_j})_0)$ has only real roots because the matrix is
symmetric. See also Sec. XI.40 on this question. (Tr.)

8. The Method of Least Squares. As an example illustrating
applications of the theory of extremum of functions of several arguments
we shall consider the least-squares method which can be applied to
constructing empirical formulas. In particular, the method is used
when the accuracy of the approximate method described in Sec. I.30
is insufficient. It is also applied to automating calculations. Here
we shall dwell only on the problem of selecting a linear functional
relationship in the case of one independent variable. In performing
these calculations we usually reason as follows (the same reasoning
can also be applied to other functional relationships): the
sought-for function is of the form y = kx + b but the values of
the parameters k and b are as yet unknown. The substitution of $x = x_i$
into the formula should have resulted in $kx_i + b$, but the experimental
data give the value $y_i$, and thus we have the difference
$y_i - kx_i - b$ between the experimental and theoretical values, which
is due to the errors of the experiment and of the calculations, the
non-linearity of the relationship under consideration etc. This
difference between the left-hand side and the right-hand side of a
formula is called a discrepancy.

Therefore, let us try to select k and b in such a way that the sum
of the squares of these discrepancies, that is the quantity

    $S = \sum_{i=1}^{N} (y_i - k x_i - b)^2$

should take on its minimal value among all the possible values.
We can also take a sum of other even powers or, for instance, the
sum of the absolute values of the discrepancies, but this will involve
more complicated calculations. At the same time we must not take
the sum of the discrepancies themselves because it can be small
when the absolute values of the summands (which can be of
different signs) are large. Thus we arrive at the problem of finding
a minimum of the function S = S (k, b). Applying necessary
conditions (19) we see that for the function to have a minimum, it is
necessary that the equalities

    $S'_k = -\sum_{i=1}^{N} 2 (y_i - k x_i - b) x_i = 0, \quad S'_b = -\sum_{i=1}^{N} 2 (y_i - k x_i - b) = 0$

should be fulfilled. It follows that

    $k \sum_{i=1}^{N} x_i^2 + b \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} x_i y_i, \quad k \sum_{i=1}^{N} x_i + bN = \sum_{i=1}^{N} y_i$


Hence, we have obtained a simple system of two algebraic equations
of the first degree in two unknowns from which k and b can
be found. All $x_i$ and $y_i$ being known, the system can easily be solved.
The fact that in this way we really obtain a minimum of S is implied
by the meaning of the problem in question.
Similar methods are applied to selecting other empirical formulas
and to some other problems.
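
The two linear equations for k and b can be assembled and solved directly. The sketch below assumes NumPy is available; the function name `fit_line` and the sample data are invented for illustration and do not come from the text.

```python
import numpy as np

def fit_line(x, y):
    """Return (k, b) minimizing S = sum (y_i - k*x_i - b)^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # k*sum(x^2) + b*sum(x) = sum(x*y)
    # k*sum(x)   + b*N      = sum(y)
    matrix = np.array([[np.sum(x * x), np.sum(x)],
                       [np.sum(x), n]])
    rhs = np.array([np.sum(x * y), np.sum(y)])
    k, b = np.linalg.solve(matrix, rhs)
    return k, b

# Illustrative measurements lying close to the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
k, b = fit_line(xs, ys)
print(k, b)
```

For these four points the sums are 14, 6, 33.7 and 16, so the system is 14k + 6b = 33.7, 6k + 4b = 16, which gives k = 1.94 and b = 1.09.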

Let us take an example. Suppose that an experiment carried out
in order to establish a relationship between some quantities x and
y indicates that this relationship is approximately expressed
by the equations

    x + y = 5.8
    x + 2y = 8.1        (30)
    2x + 3y = 13.2

The system is formally inconsistent because adding together the
first two equations we obtain 2x + 3y = 13.9, which contradicts the
third equation. But this can be a result of the errors of the experiment!
Therefore let us try to satisfy system (30) as precisely as possible so
that the sum of the squares of the discrepancies should be as small as
possible. Thus, we are going to find the values of x and y such that
the quantity

    $S = (x + y - 5.8)^2 + (x + 2y - 8.1)^2 + (2x + 3y - 13.2)^2$

assumes its minimal value. Applying necessary conditions (19)
we obtain

    $S'_x = 2 (x + y - 5.8) + 2 (x + 2y - 8.1) + 2 (2x + 3y - 13.2) \cdot 2 = 0,$
    $S'_y = 2 (x + y - 5.8) + 2 (x + 2y - 8.1) \cdot 2 + 2 (2x + 3y - 13.2) \cdot 3 = 0$

Cancelling out the factor 2 we get

    $6x + 9y = 5.8 + 8.1 + 2 \times 13.2 = 40.3$
    $9x + 14y = 5.8 + 2 \times 8.1 + 3 \times 13.2 = 61.6$

Solving the equations in the simplest way we find x ≈ 3.3 and
y = 2.3. Of course, these values only approximately satisfy system
(30). Naturally, the greater the number of relationships of form (30)
between x and y, the more reliable the values of x and y thus obtained.
Of course, this is so only if there are no systematic errors in the
experiment (random errors occurring in different relationships tend to
cancel one another). Many other systems of approximate equations can
be solved in like manner. In particular, the method can be applied to
systems of empirical equations when the number of equations exceeds the
number of unknowns.
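
The example with system (30) can be reproduced numerically. This is a minimal sketch assuming NumPy; the variable names are illustrative. It forms the same two equations as above (in matrix language, the coefficient matrix transposed times itself) and cross-checks the answer against NumPy's general least-squares routine.

```python
import numpy as np

# Coefficients and right-hand sides of system (30)
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [2.0, 3.0]])
c = np.array([5.8, 8.1, 13.2])

# Conditions S'_x = 0, S'_y = 0 after cancelling the factor 2
normal_matrix = A.T @ A      # [[6, 9], [9, 14]]
normal_rhs = A.T @ c         # [40.3, 61.6]
x, y = np.linalg.solve(normal_matrix, normal_rhs)

# Cross-check with the library least-squares solver
check, *_ = np.linalg.lstsq(A, c, rcond=None)

print(x, y)
print(check)
```

The unrounded solution is x = 9.8/3 ≈ 3.27 and y = 2.3, which agrees with the rounded value x ≈ 3.3 quoted in the text.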

The method of least squares was discovered by the French
mathematician A. M. Legendre (1752-1833) and by Gauss. It has many
useful applications at the present time.

9. Curvature of Surfaces. The classification of stationary points
described in Sec. 7 is directly related to the classification of surfaces.

Fig. 238

Let us consider an arbitrary surface (S) and take a point M on it
(see Fig. 238). If we draw the normal nn to the surface at the point
and then draw an arbitrary plane (P) passing through the normal,
the plane will intersect (S) along a plane curve ll which is called
a normal section of (S) at M. The curve ll has a certain curvature k
at the point M (see Sec. VII.24). Now if we rotate the plane (P)
about the normal nn the normal section will vary, and its curvature
k will therefore also vary, in the general case. To investigate the
law of this variation let us choose a Cartesian coordinate system so
that the origin of the coordinates should be at the point M and the
z-axis should go along the normal nn. Then in the vicinity of M the
surface (S) can be represented by an equation of the form
z = z (x, y), and the point M [at which z (0, 0) = 0] will be a
stationary (critical) point of the function z (x, y) (why will it be
so?). By arguments similar to the ones applied at the end of Sec. 7
we conclude that after a certain rotation of the coordinate axes about
the axis nn has been performed the equation of (S) turns into the form


    $z = \frac{1}{2} (\lambda_1 x'^2 + \lambda_2 y'^2) +$ the terms of higher order of smallness    (31)


where x' and y' are the new coordinates replacing x and y. Let the
plane (P) form an angle φ with the plane x'Mz. Then passing to
polar coordinates we obtain x' = ρ cos φ and y' = ρ sin φ, which
yield

    $z = \frac{1}{2} (\lambda_1 \cos^2 \varphi + \lambda_2 \sin^2 \varphi) \rho^2 + \dots$

Hence, at the point M we have $\frac{dz}{d\rho} = 0$ and
$\frac{d^2 z}{d\rho^2} = \lambda_1 \cos^2 \varphi + \lambda_2 \sin^2 \varphi$. Formula (VII.37) then implies the expression
$k = | \lambda_1 \cos^2 \varphi + \lambda_2 \sin^2 \varphi |$ for the curvature. Accordingly, there
can be three cases here:

1. Let $\lambda_1 \lambda_2 > 0$, i.e. let $\lambda_1$ and $\lambda_2$ be of the same sign. Then all
the normal sections have the same direction of convexity near the
point M, and the values of k lie between $|\lambda_1|$ and $|\lambda_2|$.
Besides, we have $k = |\lambda_1|$ for φ = 0, that is for the plane x'Mz,
and $k = |\lambda_2|$ for φ = π/2, that is for the plane y'Mz (these are
the so-called principal normal sections). A point M of this type is
called an elliptic (or umbilical in case $|\lambda_1| = |\lambda_2|$) point of the
surface (S). For instance, all the points of an ellipsoid or of a
hyperboloid of two sheets are elliptic. Equation (31) implies that the
tangent plane to (S) at the point M has only one common point M
with the surface (S) near the point M. The planes parallel to the
tangent plane which are drawn sufficiently close to it and which
intersect (S) yield intersection lines of the form of an infinitesimal
ellipse whose axes lie in the planes of principal normal sections.
The form of the sections can become more complicated when the
distance from the plane (parallel to the tangent plane) to the point
M is increased (see Fig. 239a).

2. Let $\lambda_1 \lambda_2 < 0$. Then some of the normal sections have a positive
curvature at the point M and the direction of convexity coinciding
with the direction of the outer normal to the surface near
the point M, and some other sections have a negative curvature
and the opposite direction of convexity. For instance, this is the
case for the points of a hyperboloid of one sheet. One of the sections
belonging to the first group has the maximal curvature $|\lambda_1|$ (or $|\lambda_2|$)
and one of the sections of the second group has the maximal curvature
$|\lambda_2|$ (or, respectively, $|\lambda_1|$). These principal normal sections
are also mutually perpendicular. A point M of this type is called
a hyperbolic point of the surface. The tangent plane to (S) at the
point M intersects the surface (S) along two curves which form
a nonzero angle (i.e. the angle between the tangent lines to the curves)
at M. The planes parallel to the tangent plane which are drawn
infinitely close to M intersect (S) along hyperbolas lying infinitely
close to M, the axes of the hyperbolas being directed along the
principal normal sections (see Fig. 239b).

Fig. 239 (a), (b), (c): mm and qq are principal normal sections

3. Let $\lambda_1 \lambda_2 = 0$. Then, if $\lambda_1$ and $\lambda_2$ are not simultaneously equal
to zero, all the normal sections have the same direction of convexity
near M and have a nonzero curvature at M except for one of the
sections which has zero curvature at M. The curvature of the
section which is perpendicular to the section of zero curvature is
maximal at the point M. A point of this type is called a parabolic
point. For example, all the points of a cylindrical or conical
surface are of this type. A typical disposition of the lines of
intersection of a surface (S) with the planes parallel to the tangent
plane drawn at a parabolic point M of the surface is shown in Fig. 239c
but there can be some other dispositions in different cases. The case
when $\lambda_1 = \lambda_2 = 0$, that is when k = 0 for all φ, also belongs to
this type. In such a case the point is called a planar point
of the surface (S). It is clear that all the points of a plane are of
this type.
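
The three cases can be illustrated with a short computation. The sketch below assumes NumPy and assumes, as in the text, that the coordinates are chosen so that M is the origin and the z-axis goes along the normal; under that assumption λ₁ and λ₂ are the roots of the characteristic equation of the matrix of second derivatives of z (x, y) at M. The function names and the tolerance are illustrative.

```python
import numpy as np

def classify_surface_point(z_xx, z_xy, z_yy, tol=1e-12):
    """Classify the point M from the second derivatives of z(x, y) at M."""
    hessian = np.array([[z_xx, z_xy], [z_xy, z_yy]], dtype=float)
    lam1, lam2 = np.linalg.eigvalsh(hessian)   # real roots, ascending order
    product = lam1 * lam2                      # decides the type of the point
    if product > tol:
        kind = "elliptic"
    elif product < -tol:
        kind = "hyperbolic"
    elif max(abs(lam1), abs(lam2)) > tol:
        kind = "parabolic"
    else:
        kind = "planar"
    return kind, lam1, lam2

def normal_curvature(lam1, lam2, phi):
    """k = |lam1*cos^2(phi) + lam2*sin^2(phi)| for the section at angle phi."""
    return abs(lam1 * np.cos(phi) ** 2 + lam2 * np.sin(phi) ** 2)

print(classify_surface_point(1.0, 0.0, 1.0))    # z = (x^2 + y^2)/2, a bowl
print(classify_surface_point(1.0, 0.0, -1.0))   # z = (x^2 - y^2)/2, a saddle
print(classify_surface_point(1.0, 0.0, 0.0))    # z = x^2/2, a cylinder
print(normal_curvature(2.0, 1.0, np.pi / 4))    # a section between the principal ones
```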

In the above concrete examples all the points of each of the
surfaces were of the same type but this need not be so in
the general case. For instance, the surface of a torus has points
belonging to each of the three types (where are these points placed
on the surface of a torus?).

In all cases the product $\lambda_1 \lambda_2$ is called the total (Gaussian)
curvature of the surface (S) at the point M. There is a remarkable
property of the total curvature: when a surface is bent without
stretching its total curvature does not change. For instance, if we take
a sheet of paper and bend it in an arbitrary way the surface thus
obtained will have zero total curvature at each of its points. For the
same reason it is impossible to flatten a portion of a sphere without
deformation, and there cannot therefore be a geographic map
without distortion.

A surface can have singular points (usually these are isolated 
points or “conical” points similar to the vertex of a circular cone 
but of course there are also singular points of more complicated 
types). It can also have singular lines which are loci of singular 
points (most often these lines are isolated lines or lines of self-inter- 
section; there are also “cuspidal edges” and some other types of sin- 
gular lines). 

10. Conditional Extremum. In the problems considered in Sec. 7 
we investigated extrema in the case when independent variables 
were not connected by any additional relationships. An extremum 
of this kind is called an unconditional extremum. But there are 
also problems concerning a so-called conditional extremum when 
arguments are related to one another by relationships of the form 
of an equality. We begin our investigation with functions of two 
independent variables. 

Suppose we seek a maximum or a minimum of a function
z = f (x, y) on the condition that x and y are restricted by the
relationship

    F (x, y) = h    (32)

Equation (32) is called a coupling equation [conditions of form (32)
are also called subsidiary conditions, side conditions or constraints].
Thus, we consider and compare only those values of the function f
that correspond to the points (lying in the x, y-plane) which belong
to the curve represented by equation (32). For instance, in Fig. 240
we see level lines of a function f. At the point K the function attains
its maximum (unconditional). At the same time there are three
conditional extrema here, namely two maxima at the points A
and C and one minimum at the point B (think why it is so). An
unconditional maximum can be compared to the top of a mountain.
Then it is natural to compare a conditional maximum to the highest
point of a mountain path whose projection on the x, y-plane has
an equation of form (32).

Fig. 240

If it is possible to express y in terms of x with the aid of equation
(32) then we can substitute the result y = y (x) into the expression
of z and thus obtain z as a function of a single independent variable:

    z = f [x, y (x)]    (33)

When substituting y = y (x) we have taken into account condition
(32), and, since there are no other restrictions, the problem reduces
to finding an unconditional extremum of z = f [x, y (x)]. There
will be a similar situation if it is possible to solve equation (32)
for x or if the curve defined by equation (32) can be represented by
parametric equations.

It should be noted that solving equation (32) in this way
may not be possible in some cases and, besides, it can be
inconvenient and can lead to complicated expressions even when equation
(32) is solvable. In such a case we can reason in the following way.
Coupling equation (32) defines some relationship y = y (x) which
may not be known in an explicit form. Consequently, z is a composite
function of the independent variable x of form (33). The necessary
condition for an extremum can therefore be put down in the form

    $\frac{dz}{dx} = f'_x + f'_y \frac{dy}{dx} = 0$    (34)

on the basis of the formula for the derivative of a composite function.
Here the expression $\frac{dy}{dx}$ designates the derivative of the implicit
function y (x) defined by condition (32). Hence, by Sec. IX.13, we
have

    $F'_x + F'_y \frac{dy}{dx} = 0$, i.e. $\frac{dy}{dx} = -\frac{F'_x}{F'_y}$

Substituting this expression into (34) we see that at the point of
a conditional extremum we have

    $f'_x - \frac{F'_x}{F'_y} f'_y = 0$, i.e. $\frac{f'_x}{F'_x} = \frac{f'_y}{F'_y}$

Let us denote the value of the last ratio by λ. Then we have

    $\frac{f'_x}{F'_x} = \frac{f'_y}{F'_y} = \lambda$    (35)

at the point of a conditional extremum. This can be rewritten as

    $f'_x - \lambda F'_x = 0, \quad f'_y - \lambda F'_y = 0$    (36)

Let us introduce the notation

    $f^* (x, y; \lambda) = f (x, y) - \lambda F (x, y)$    (37)

where λ is an undetermined parameter which is called Lagrange’s
undetermined multiplier (factor) after Lagrange, who introduced
this method. Then equations (36) can be put down in the form

    $f^{*\prime}_x = 0, \quad f^{*\prime}_y = 0$    (38)

Thus, we have arrived at equations of the same form [see equations
(19)] but for the modified function f* defined by formula (37) instead
of the original function f. Equations (38) together with the coupling
equation (32) form a system of three equations in three unknowns
which are x, y and λ. The points in the x, y-plane at which a
conditional extremum may be attained are found from these equations.
The conditions thus obtained are only necessary. Sufficient
conditions guaranteeing the existence of a conditional extremum at
a point defined by equations (38) and (32) can similarly be deduced
from the sufficient conditions for an unconditional extremum
established above but we shall not do this here. [By the way, in our case
it is sufficient to compute the derivative $\frac{d^2 z}{dx^2}$ and to investigate its
sign at the point defined by (32) and (38).]

Lagrange’s multiplier λ has a simple meaning. To illustrate it
denote the coordinates of the point of a conditional extremum by
$\bar{x}$ and $\bar{y}$. Let $\bar{z}$ designate the corresponding extremal value of z.
Up till now the quantity h entering into equation (32) has been
considered to be fixed. But now we can vary h and then all the
three quantities $\bar{x}$, $\bar{y}$ and $\bar{z}$ will also vary, i.e. $\bar{x}$, $\bar{y}$ and $\bar{z}$ will become
functions of h. Since the identity $\bar{z} (h) = f [\bar{x} (h), \bar{y} (h)]$ holds, we have

    $\frac{d\bar{z}}{dh} = f'_x \frac{d\bar{x}}{dh} + f'_y \frac{d\bar{y}}{dh}$    (39)

On the other hand, by (32), we have

    $F'_x \frac{d\bar{x}}{dh} + F'_y \frac{d\bar{y}}{dh} = 1$    (40)

From (39), (35) and (40) we readily deduce $\frac{d\bar{z}}{dh} = \lambda$. Hence, the
factor λ is equal to the rate of change of the extremal value $\bar{z}$ when
the parameter h entering into coupling equation (32) varies.
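
This meaning of the multiplier can be checked on a simple case. The sketch below assumes NumPy and uses an example that is not taken from the text: f (x, y) = xy with the coupling equation x + y = h. For this choice equations (36) read y − λ = 0 and x − λ = 0, so together with the coupling equation they form a linear system in x, y and λ. The function name is illustrative.

```python
import numpy as np

def conditional_extremum(h):
    """Solve y - lam = 0, x - lam = 0, x + y = h for f = x*y, F = x + y."""
    matrix = np.array([[0.0, 1.0, -1.0],
                       [1.0, 0.0, -1.0],
                       [1.0, 1.0, 0.0]])
    x, y, lam = np.linalg.solve(matrix, np.array([0.0, 0.0, h]))
    return x, y, lam, x * y        # the last entry is the extremal value

h, dh = 3.0, 1e-6
lam = conditional_extremum(h)[2]
# Central difference estimate of the rate of change of the extremal value
rate = (conditional_extremum(h + dh)[3] - conditional_extremum(h - dh)[3]) / (2 * dh)
print(lam, rate)
```

Here the extremal value is h²/4, so its derivative with respect to h is h/2, which coincides with the multiplier λ = h/2 found from the system.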

The investigation of a conditional extremum in the general case
of an arbitrary number of independent variables and any number
of coupling equations is carried out in like manner. [We remind
the reader that according to Sec. X.2 the number of coupling
equations (subsidiary conditions) must be less than the number of
arguments.]

For instance, if we are looking for an extremum of a function
f (x, y, z, u, v) when the arguments x, y, z, u and v are restricted
by the conditions

    $F_1 (x, y, z, u, v) = 0, \quad F_2 (x, y, z, u, v) = 0$ and $F_3 (x, y, z, u, v) = 0$    (41)

we must perform calculations as if we wanted to find an
unconditional extremum of the function

    $f^* = f - \lambda_1 F_1 - \lambda_2 F_2 - \lambda_3 F_3$

where $\lambda_1$, $\lambda_2$ and $\lambda_3$ are undetermined Lagrange multipliers. The
stationarity condition for f* yields the equations

    $f^{*\prime}_x = 0, \quad f^{*\prime}_y = 0, \quad f^{*\prime}_z = 0, \quad f^{*\prime}_u = 0$ and $f^{*\prime}_v = 0$

which, together with equations (41), form a system of 8 = 5 + 3
equations in 8 = 5 + 3 unknowns x, y, z, u, v, $\lambda_1$, $\lambda_2$ and $\lambda_3$.

Methods of solving a problem of a conditional extremum form
the basis of one of the widely spread numerical methods of finding
an unconditional extremum. This is the so-called method of steepest
descent. Here we shall describe a variant of the method applicable
to the case of a minimum of a function of two arguments although
in the general case different modifications of the method can be
applied to finding minima (or maxima) of functions of any number
of independent variables. Let it be necessary to find a point of
minimum of a function f (x, y). It turns out that when the function
f does not belong to a certain class of the simplest functions it is
rather difficult (and even impossible) to solve equations (19).
Besides, in solving equations (19) we find all the stationary points
including those which are not points of minimum and hence we do
much unnecessary work. It is therefore better to apply the method
of steepest descent which is an iterative method. We begin with
taking a point $M_0 (x_0, y_0)$ as a zeroth approximation. At this point
the function f has the maximal rate of decrease in the direction of
the vector $-(\mathrm{grad}\, f)_{M_0} = -f'_x (x_0, y_0)\, \mathbf{i} - f'_y (x_0, y_0)\, \mathbf{j}$ (because, as
was shown in Sec. 1, the direction of the vector grad f indicates
the direction of the maximal rate of increase of the function). Let
us draw a ray through the point $M_0$ in this direction and consider
the values of the function f which are taken on the ray. These values
are expressed by the quantity $f (x_0 - f'_{x0} t, \; y_0 - f'_{y0} t)$ which we
regard as a function of t for t > 0. After that we find a value of t
for which this function of one variable attains its minimum. This
value of t determines a new point $M_1 (x_1, y_1)$. Then we draw a ray
through the point $M_1$ in the direction of the vector $-(\mathrm{grad}\, f)_{M_1}$
[$(\mathrm{grad}\, f)_{M_1}$ designates the gradient at the point $M_1$] and find a point
$M_2$ at which f attains its minimum on the ray etc. In many cases
this method enables us to find an approximate position of the
sought-for point of extremum with sufficient accuracy after several
steps of this kind have been carried out. (We suggest that the reader
should consider level lines of a function f on the x, y-plane and
find out the geometric meaning of the method.) Methods of this
type which enable us to find extrema without using necessary
conditions are called direct methods.
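
The procedure described above can be sketched in a few lines. The code below assumes NumPy; the function names, the test function and the search interval `t_max` for t are illustrative assumptions, not part of the text. The minimum along each ray is located by a golden-section search, which is one possible choice among many and presumes that f has a single minimum on the searched part of the ray.

```python
import numpy as np

def golden_section_min(phi, a, b, tol=1e-10):
    """Locate the minimum of a one-variable function phi on [a, b]."""
    ratio = (np.sqrt(5.0) - 1.0) / 2.0
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    while b - a > tol:
        if phi(c) < phi(d):
            b = d                  # the minimum lies in [a, d]
        else:
            a = c                  # the minimum lies in [c, b]
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
    return 0.5 * (a + b)

def steepest_descent(f, grad, start, t_max=1.0, steps=50, gtol=1e-8):
    """Move repeatedly to the minimum of f on the ray along -grad f."""
    p = np.asarray(start, dtype=float)
    for _ in range(steps):
        g = grad(p)
        if np.linalg.norm(g) < gtol:
            break                  # (nearly) stationary point reached
        # f on the ray, regarded as a function of t in [0, t_max]
        t = golden_section_min(lambda t: f(p - t * g), 0.0, t_max)
        p = p - t * g
    return p

# Illustrative function with its minimum at the point (1, -2)
f = lambda p: (p[0] - 1.0) ** 2 + 2.0 * (p[1] + 2.0) ** 2
grad = lambda p: np.array([2.0 * (p[0] - 1.0), 4.0 * (p[1] + 2.0)])
print(steepest_descent(f, grad, [0.0, 0.0]))
```

For this particular function the minimizing t on every ray lies between 1/4 and 1/2, so the assumed interval [0, 1] is sufficient; other functions may need a different interval or a bracketing step.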

11. Extremum with Unilateral Constraints. Independent variables
involved in an extremal problem can be restricted by one or several
conditions of the form of an inequality. Such conditions are called
unilateral (one-sided or non-restricting) constraints. For example,
let an extremum of a function f (x, y) be sought for and let the
independent variables be related to one another by the constraint
F (x, y) ≥ 0 which defines a domain (S) with a boundary (L)
in the x, y-plane (see Fig. 241). The curve (L) is defined by the
equation F = 0. The function f can have both extrema attained
in the interior of (S) and extrema attained on (L). In order to find
the former we can use conditions (19) defining stationary points
but these conditions do not apply to extrema on the boundary (L).
To find the latter extrema we remark that if the function f has an
extremum at a point M belonging to (L), for instance, a minimum,
then the value f (M) is smaller than all the values of f taken on (L)
near M. Therefore there will simultaneously be a conditional
minimum of f at M for the coupling equation F = 0. Hence, such points
can be found with the help of the methods described in Sec. 10.

Fig. 241    Fig. 242

These results can be applied to the problem of finding the greatest
and the least values of a function. For instance, let a function
z = f (x, y) be considered in the domain depicted in Fig. 242. Suppose
that neither the function nor its derivatives have discontinuities
in the domain. If the point at which the function attains its greatest
value lies in the interior of the domain then there will be an
unconditional maximum of the function at this point. If the point belongs
to the contour bounding the domain but does not lie at the vertices
A, B and C then there will be a conditional maximum of the
function at this point, and the equation of the corresponding arc of the
contour will serve as condition (32). Finally, if the greatest value
is attained at a vertex (A, B or C) then in order to find this greatest
value we must additionally compare the values of the function at
the points A, B and C with its other extremal values.

Thus, to find the greatest value we must find all the points of 
unconditional maximum lying in the interior of the domain and 
all the points of conditional maximum belonging to the contour. 
Moreover, in case it is difficult to specify beforehand which of the 
possible points of conditional extremum will be the points of maxi- 
mum it is advisable to find the values of the function at all these 
points. After that we compare with each other the extremal (maxi- 
mal) values taken in the interior of the domain, the extremal values 
attained on the contour and the values of the function at the points 
A, B and C and thus find the sought-for greatest value of the
function. The least value of a function is found in like manner. As in
Sec. IV.19, it is better to seek the greatest and the least values of
a function simultaneously.

If the domain under consideration contains points of discontinuity
of the partial derivatives of the first order the values of the
function at these points should be included into the set of values
which we compare with each other because it can turn out that the
greatest (or the least) value of the function is attained at some of these
points. If there are points of discontinuity of the function then we
should additionally investigate the behaviour of the function in
approaching such points. If there are lines of discontinuity of the
function or of its derivatives then the values of the function taken
on such lines must also be investigated which leads to the problem
of a conditional extremum. Finally, if the domain in which we
investigate the function extends to infinity we must additionally
investigate the behaviour of the function when the variable point
in the x, y-plane approaches infinity.

Functions of a greater number of independent variables are
investigated in a similar way. But in performing such an investigation
we must take into account some additional factors which are due to the
increase of the dimension. For instance, if we consider a function
u = f (x, y, z) in the domain shown in Fig. 243 then we must consider its
unconditional maxima in the interior of the “curvilinear tetrahedron”, its
conditional maxima with one coupling equation on the “faces” of the
tetrahedron (in this case the equations of the corresponding surfaces
serve as coupling equations), the conditional maxima with two coupling
equations on the “edges” of the tetrahedron
[the role of the coupling equations can be played by the equations
of the corresponding space curve if they are put down in
form (X.2)] and, finally, the values of the function at the “vertices”.
Comparing all these values with each other we find the greatest
value of the function. Similarly, if there are surfaces of discontinuity
of the function we come to the problem of a conditional maximum
with one coupling equation, and if there are lines of discontinuity
we come to the problem of a conditional extremum with
two coupling equations etc.

Fig. 243

If we apply an iterative scheme of the type of the method of
steepest descent described in Sec. 10 then in the case when there
are several minima (maxima) in the domain in question we can
come to a minimum (maximum) which is not the least (greatest).
In such a case it is advisable to apply the method several times
beginning with different zeroth approximations chosen at random.
For instance, after such repeated calculations have been carried
out we may obtain a smaller minimum than the one found in the
first calculation, and in many problems this approach leads to the
least value of the function.

12. Numerical Solution of Systems of Equations. In conclusion
we shall consider some methods of numerical solution of a system
of two equations in two unknowns. The case of a system of n
equations in n unknowns can be treated similarly.

The iterative method has the same form as in Secs. V.3 and VI.5.
To apply the method we rewrite the system in question in the form

    $x = f (x, y), \quad y = g (x, y)$    (42)

Then we choose a zeroth approximation $x = x_0$, $y = y_0$. The
subsequent approximations are constructed according to the formulas

    $x_1 = f (x_0, y_0), \; y_1 = g (x_0, y_0); \quad x_2 = f (x_1, y_1), \; y_2 = g (x_1, y_1); \quad \dots$

If the process is convergent then in the limit we obtain a
solution of system (42). The smaller the rate of change of the functions
f and g when their arguments vary (that is the smaller the absolute
values of the partial derivatives of the functions), the better the
convergence of the process.

A modification of the method (the Seidel method) which is based
on using some numerical values obtained at each step of the
calculations for computing other values at the same step may sometimes
accelerate the convergence. Such calculations are performed
according to the following scheme: $x_1 = f (x_0, y_0)$, $y_1 = g (x_1, y_0)$,
$x_2 = f (x_1, y_1)$, $y_2 = g (x_2, y_1)$ and so on.
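
Both schemes can be compared on a small example. The sketch below uses only the Python standard library; the particular functions f and g are an invented illustration, chosen so that their partial derivatives do not exceed 0.5 in absolute value and the iterations therefore converge.

```python
import math

def f(x, y):
    return 0.5 * math.cos(y)

def g(x, y):
    return 0.5 * math.sin(x) + 1.0

def simple_iteration(x, y, steps=100):
    """x_{n+1} = f(x_n, y_n), y_{n+1} = g(x_n, y_n)."""
    for _ in range(steps):
        x, y = f(x, y), g(x, y)    # both right-hand sides use the old values
    return x, y

def seidel_iteration(x, y, steps=100):
    """x_{n+1} = f(x_n, y_n), y_{n+1} = g(x_{n+1}, y_n)."""
    for _ in range(steps):
        x = f(x, y)
        y = g(x, y)                # uses the value of x just computed
    return x, y

print(simple_iteration(0.0, 0.0))
print(seidel_iteration(0.0, 0.0))
```

Both schemes approach the same solution of the system x = f (x, y), y = g (x, y); they differ only in which values are used inside one step.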

Newton’s method (see Sec. V.2) is based on the replacement of
given functions by their linear approximations constructed with
the help of the values of the functions and their derivatives for
the values of the arguments approximately equal to the sought-for
solutions. Suppose we have to solve a system of equations of
the form

    $P (x, y) = 0, \quad Q (x, y) = 0$    (43)

Let us begin with a zeroth approximation $x = x_0$, $y = y_0$ to the
sought-for solution which can be found by means of an approximate
sketch of curves (43) in the x, y-plane or which can be implied by
the physical meaning of the problem and the like. Taking expansions
(17) of the functions P and Q into powers of $h = x - x_0$ and
$k = y - y_0$ and dropping the terms of higher order of smallness we
arrive at the following system of equations:

    $P (x_0, y_0) + P'_x (x_0, y_0) (x - x_0) + P'_y (x_0, y_0) (y - y_0) = 0$
    $Q (x_0, y_0) + Q'_x (x_0, y_0) (x - x_0) + Q'_y (x_0, y_0) (y - y_0) = 0$    (44)

System (44) approximately replaces system (43). Solving (44) which
is a system of linear equations we obtain the values of the first
approximation $x = x_1$ and $y = y_1$. The second approximation is
found from system (44) after $x_1$ and $y_1$ are substituted for $x_0$ and $y_0$
in it and so on. The relationship between the nth and (n + 1)th
approximations is of the form

    $P (x_n, y_n) + P'_x (x_n, y_n) (x_{n+1} - x_n) + P'_y (x_n, y_n) (y_{n+1} - y_n) = 0$
    $Q (x_n, y_n) + Q'_x (x_n, y_n) (x_{n+1} - x_n) + Q'_y (x_n, y_n) (y_{n+1} - y_n) = 0$


If the process is convergent we pass to the limit as n → ∞. In
the limit the last two summands in each of the equations vanish
and thus we see that the limiting values satisfy system (43). It can
be shown that if the initial approximation is chosen sufficiently
close to the sought-for solution and if the Jacobian (see Sec. IX.13)
$\frac{D (P, Q)}{D (x, y)}$ is unequal to zero, i.e. if

    $\begin{vmatrix} P'_x & P'_y \\ Q'_x & Q'_y \end{vmatrix} \neq 0$

then the approximations are sure to converge. [How is the Jacobian
related to system (44)?]
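
The passage from one approximation to the next amounts to solving one linear system per step. The sketch below assumes NumPy; the function name `newton_system`, the stopping rule and the sample system (a circle intersected with a hyperbola) are illustrative and not taken from the text.

```python
import numpy as np

def newton_system(F, J, start, steps=20, tol=1e-12):
    """Newton's method for F(p) = 0, where p = (x, y).

    F returns the vector (P, Q); J returns the matrix of partial
    derivatives [[P'_x, P'_y], [Q'_x, Q'_y]].
    """
    p = np.asarray(start, dtype=float)
    for _ in range(steps):
        # Linearized system: F(p_n) + J(p_n) (p_{n+1} - p_n) = 0
        delta = np.linalg.solve(J(p), -F(p))
        p = p + delta
        if np.linalg.norm(delta) < tol:
            break
    return p

# Illustrative system: P = x^2 + y^2 - 4 = 0, Q = x*y - 1 = 0
F = lambda p: np.array([p[0] ** 2 + p[1] ** 2 - 4.0, p[0] * p[1] - 1.0])
J = lambda p: np.array([[2.0 * p[0], 2.0 * p[1]],
                        [p[1], p[0]]])

root = newton_system(F, J, [2.0, 0.5])
print(root, F(root))
```

The linear system solved at each step is solvable only when the matrix returned by `J` has a nonzero determinant, which is the Jacobian condition stated above.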

We can also take advantage of the fact that a solution of system
(43) simultaneously gives a minimum to the function
$V (x, y) = [P (x, y)]^2 + [Q (x, y)]^2$ (why is it so?). Instead of this function
we sometimes take a similar expression in which there are certain
positive numerical coefficients in front of the squares of the
functions which are introduced to balance the “significance” of both
equations (43). Taking a function V of this type we then find its
minimum with the help of one of the direct methods mentioned
in Secs. 10 and 11. [Obviously, it is senseless to apply the necessary
conditions for an extremum to the function V because this will
lead us back to system (43). Let the reader verify this assertion.]
If the minimal value thus found is equal to zero the point of the
minimum yields a solution of system (43).


CHAPTER XIII 




Indefinite Integral 


§ 1. Elementary Methods of Integration 


1. Basic Definitions. Let the function f (x) be the derivative of
a function F (x), i.e. F' (x) = f (x). Then F (x) is said to be an
antiderivative (or primitive) of f (x). For instance, the function
$3x^2$ is the derivative of $x^3$, and $x^3$ is an antiderivative of $3x^2$.

Differential calculus deals with the basic problem of finding the 
derivative of a given function and with the problem of finding its 
differential which is directly related to the former problem. For 
functions of one independent variable, this problem was considered 
in Chapter IV. In particular, as it was shown in Sec. IV.5, the deri- 
vative of any elementary function is an elementary function which 
is found by means of standard rules. 

The main problem of integral calculus is reverse to that of diffe- 
rential calculus. This is the problem of finding a function when the 
derivative of the function is given, that is the problem of finding 
antiderivatives of a given function. The significance of the problem 
will be discussed in Chapter XIV. This problem is more complicated 
than the problem of differentiation. (As a rule, “reverse” problems 
are more complicated than the “direct” ones. For instance, the 
problem of extracting a root is more complicated than the problem 
of raising to a power.) In particular, we shall see that although the 
antiderivative of any elementary function exists (and these are 
the functions that we shall deal with in the present chapter) it 
may not be an elementary function. 

A given function possesses more than one antiderivative. For
example, we have not only $(x^3)' = 3x^2$ but $(x^3 + 5)' = 3x^2$ as well.
(It often turns out, in different examples, that solutions of reverse
problems are not unique.) In general, if a function f (x) has
antiderivatives $F_1(x)$ and $F_2(x)$ then $F_1' = f$ and $F_2' = f$, i.e.
$F_1' - F_2' = (F_1 - F_2)' = 0$, and thus we have $F_1 - F_2 = \text{const}$
(see Sec. IV.17) and $F_1 = F_2 + \text{const}$. Consequently, any two
antiderivatives of the same function differ in a constant summand. Hence,
in order to obtain all the antiderivatives of a given function it is
sufficient to take one of the antiderivatives and to add an arbitrary
constant to it. For instance, the family of all antiderivatives of the
function $3x^2$ is given by the formula $x^3 + C$ where C is an arbitrary
constant. Making C assume concrete numerical values we obtain
particular antiderivatives: $x^3$, $x^3 + 5$, $x^3 - \sqrt{2}$, $x^3 + 2$ etc.


The family of all antiderivatives of a function f (x) is called the
indefinite integral of the function f (x) and is denoted by the symbol
$\int f(x)\,dx$ whose meaning will be discussed at length in Sec. XIV.2.
Here $\int$ is the integral sign, f (x) is the integrand and f (x) dx is the
element of integration. Thus,

    if $F'(x) = f(x)$ then $\int f(x)\,dx = F(x) + C$, and vice versa    (1)

For instance, $\int 3x^2\,dx = x^3 + C$. In other words, the indefinite
integral is the general expression of antiderivatives which involves
an arbitrary constant, and every concrete numerical value of the
constant yields a certain concrete antiderivative.

Formula (1) implies that

    $\left(\int f(x)\,dx\right)' = f(x)$,  $d\left(\int f(x)\,dx\right) = f(x)\,dx$,
    $\int dF(x) = F(x) + C$    (2)

Therefore, the signs of integration and differentiation mutually
cancel out. The result of computing an indefinite integral can always
be verified by finding the derivative of the result. If the answer is
correct the differentiation must yield the integrand. To each formula
of differential calculus (see Secs. IV.4-5) there corresponds
a certain formula of integral calculus.
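The verification rule just stated can be tried numerically. The following is only a minimal sketch, not part of the original course: it replaces exact differentiation by a central difference, and the helper names `derivative` and `is_antiderivative`, the sample points and the tolerance are all assumptions made for illustration.

```python
import math

def derivative(F, x, h=1e-5):
    """Central-difference approximation of F'(x) (a numerical stand-in for differentiation)."""
    return (F(x + h) - F(x - h)) / (2 * h)

def is_antiderivative(F, f, points, tol=1e-6):
    """Return True if F'(x) agrees with f(x) within tol at every sample point."""
    return all(abs(derivative(F, x) - f(x)) <= tol for x in points)

points = [-2.0, -0.5, 0.3, 1.7]

# x^3 + 5 is one member of the family x^3 + C of antiderivatives of 3x^2.
ok_cubic = is_antiderivative(lambda x: x**3 + 5, lambda x: 3 * x**2, points)

# A deliberately wrong "answer": sin x is not an antiderivative of sin x.
ok_wrong = is_antiderivative(math.sin, math.sin, points)

print(ok_cubic, ok_wrong)
```

A check of this kind can only reject a wrong answer at the sampled points; it does not replace the exact differentiation recommended in the text.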

2. The Simplest Integrals. These integrals are obtained by reversing
formulas for differentiation of basic elementary functions (see
Sec. IV.5). For instance, formula (sin x)' = cos x implies

    $\int \cos x\,dx = \sin x + C$    (3)

[see formula (1)]. The formula (cos x)' = -sin x, or (-cos x)' = sin x, implies

    $\int \sin x\,dx = -\cos x + C$

Similarly,

    $\int \frac{1}{\cos^2 x}\,dx = \tan x + C$  (this can be written as $\int \frac{dx}{\cos^2 x} = \tan x + C$),
    $\int \frac{dx}{\sin^2 x} = -\cot x + C$  and  $\int \frac{dx}{\sqrt{1-x^2}} = \arcsin x + C$    (4)

From formula $(\arccos x)' = -\frac{1}{\sqrt{1-x^2}}$ we deduce

    $\int \frac{dx}{\sqrt{1-x^2}} = -\arccos x + C$    (5)


At first glance one can think that the latter formula contradicts the
former. But this is not so because formula $\arcsin x + \arccos x = \frac{\pi}{2}$
(see Sec. IV.18) and formulas (4) imply that

    $\int \frac{dx}{\sqrt{1-x^2}} = \arcsin x + C = \frac{\pi}{2} - \arccos x + C = -\arccos x + C_1$

where $C_1 = \frac{\pi}{2} + C$. Thus, the matter is that the right-hand sides
of formulas (4) and (5) contain different arbitrary constants. Such
a discrepancy in different forms of an answer can appear in many
other examples of indefinite integrals. Naturally, in concrete
computations one must choose either formula (4) or formula (5); for
instance, we can choose formula (4).
Other formulas of differentiation imply

    $\int \frac{dx}{1+x^2} = \arctan x + C$  and  $\int \frac{dx}{x} = \ln x + C$

A disadvantage of the last formula is that the function $\frac{1}{x}$ whose
antiderivative is being computed exists both for x > 0 and for
x < 0 whereas the right-hand side is defined only for x > 0. But
we can easily verify that there is a more general differentiation
formula of the form $(\ln|x|)' = \frac{1}{x}$. Indeed, we have |x| = x
for x > 0 and therefore our formula yields the ordinary derivative
of the logarithmic function in this case, and we have |x| = -x
for x < 0 and hence, for this case, we have
$(\ln|x|)' = (\ln(-x))' = \frac{1}{-x}\,(-1) = \frac{1}{x}$. Therefore, formula

    $\int \frac{dx}{x} = \ln|x| + C$    (6)

is valid both for x > 0 and for x < 0.
Further, we deduce

    $\int a^x\,dx = \frac{a^x}{\ln a} + C$  (in particular, $\int e^x\,dx = e^x + C$)

and

    $\int x^{n-1}\,dx = \frac{x^n}{n} + C$, that is $\int x^m\,dx = \frac{x^{m+1}}{m+1} + C$

Of course, the last formula does not hold for m = -1 because the
denominator vanishes in this case. But for m = -1 the integral
turns into $\int \frac{dx}{x}$ which is computed by formula (6).
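Further, we have

    $\int \cosh x\,dx = \sinh x + C$,  $\int \sinh x\,dx = \cosh x + C$,
    $\int \frac{dx}{\cosh^2 x} = \tanh x + C$  and
    $\int \frac{dx}{\sqrt{x^2+1}} = \sinh^{-1} x + C = \ln\left(x + \sqrt{x^2+1}\right) + C$

(the formula for $\sinh^{-1} x$ was deduced in Sec. I.28).
The formula deduced above can be obtained without hyperbolic
functions if we directly put it down in the form

    $\int \frac{dx}{\sqrt{x^2+1}} = \ln\left(x + \sqrt{x^2+1}\right) + C$

and then differentiate the answer. Moreover, we have $(\ln|u|)' = \frac{1}{u}\,u'_x$ and thus

    $\int \frac{dx}{\sqrt{x^2+a}} = \ln\left|x + \sqrt{x^2+a}\right| + C$  (a = const)

because

    $\left(\ln\left|x + \sqrt{x^2+a}\right|\right)' = \frac{1}{x + \sqrt{x^2+a}}\left(1 + \frac{2x}{2\sqrt{x^2+a}}\right) = \frac{1}{\sqrt{x^2+a}}$

where a is a constant of an arbitrary sign.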
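The remark that the constant a may have either sign can be illustrated numerically. The sketch below is an added illustration under stated assumptions: the function names, the step of the central difference and the sample pairs (x, a) are hypothetical choices, picked so that $x^2 + a > 0$.

```python
import math

def F(x, a):
    """ln|x + sqrt(x^2 + a)|, the antiderivative quoted in the text (constant omitted)."""
    return math.log(abs(x + math.sqrt(x * x + a)))

def f(x, a):
    """The integrand 1/sqrt(x^2 + a)."""
    return 1.0 / math.sqrt(x * x + a)

def slope(x, a, h=1e-6):
    """Central-difference approximation of dF/dx."""
    return (F(x + h, a) - F(x - h, a)) / (2 * h)

# (x, a) pairs with x^2 + a > 0; a takes both signs and x lies on both sides of zero.
samples = [(3.0, 1.0), (-3.0, 1.0), (3.0, -4.0), (-3.0, -4.0)]
worst = max(abs(slope(x, a) - f(x, a)) for x, a in samples)
print(worst)
```

The absolute value inside the logarithm matters for the pair x = -3, a = -4, where $x + \sqrt{x^2 + a}$ is negative.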
The above formulas form the table of integrals of the simplest 


integral and try to change the answer in such a manner that its 
derivative should be equal to the integrand. Essentially, this method 
reduces to applying formula (1). For example, in order to find the 
integral

    $\int \cos 3x\,dx$    (7)

it is natural to take formula (3). But the answer sin 3x + C does
not apply since the derivative of sin 3x + C is equal to 3 cos 3x
but not to cos 3x which is required. But if we divide sin 3x by 3 the
derivative will be multiplied by $\frac{1}{3}$. Hence,
$\left(\frac{1}{3}\sin 3x + C\right)' = \cos 3x$, i.e. $\int \cos 3x\,dx = \frac{1}{3}\sin 3x + C$.
In like manner we find

    $\int \frac{dx}{\sqrt{1-(2x+5)^2}} = \frac{1}{2}\arcsin(2x+5) + C$,  $\int \frac{dx}{x-3} = \ln|x-3| + C$,
    $\int \frac{dx}{1+\left(\frac{x}{2}\right)^2} = 2\arctan\frac{x}{2} + C$    (8)

(check the answers by means of differentiation!).
Generally, if an integral $\int f(x)\,dx = F(x) + C$ has been found
then

    $\int f(ax+b)\,dx = \frac{1}{a}\,F(ax+b) + C$

where a and b are arbitrary constant numbers (with a different from zero).

3. The Simplest Properties of an Indefinite Integral. These pro- 
perties are implied by the analogous properties of a derivative (see 
Sec. IV.4). For instance, 


    $\int [f(x) \pm g(x)]\,dx = \int f(x)\,dx \pm \int g(x)\,dx$    (9)


that is the indefinite integral of an algebraic sum is equal to the 
sum of the integrals of the summands. To prove the property we 
take the derivatives of the left-hand side and right-hand side and 
then verify that the results will be equal on the basis of the first 
formula (2) and on the basis of the well-known property of deriva- 
tives which asserts that the derivative of a sum is equal to the sum 
of the derivatives of the summands. The derivatives being equal, 
the corresponding functions can differ only in a constant summand. 
Thus the proof is completed because the constant must not be put 
down in formula (9) since the integral signs include arbitrary con- 
stant summands. 
We similarly verify that 


    $\int A f(x)\,dx = A \int f(x)\,dx$  (A = const)    (10)


Thus, a constant factor can be taken outside the integral sign. 
A given integral can often be represented in the form of a sum
of tabular integrals with the help of formulas (9) and (10). Then we
perform termwise integration and thus obtain the answer (this is
the decomposition method). Let us consider several examples; we have

    $\int (3x^5 - 2x + 5)\,dx = \int 3x^5\,dx - \int 2x\,dx + \int 5\,dx =$
    $= 3\int x^5\,dx - 2\int x\,dx + 5\int dx = 3\,\frac{x^6}{6} - 2\,\frac{x^2}{2} + 5x + C =$
    $= \frac{x^6}{2} - x^2 + 5x + C$    (11)


(of course, we have put down only one arbitrary constant here
because a sum of arbitrary constants is an arbitrary constant),

    $\int \frac{dx}{a^2+x^2} = \frac{1}{a^2}\int \frac{dx}{1+\left(\frac{x}{a}\right)^2} = \frac{1}{a^2}\,a\arctan\frac{x}{a} + C = \frac{1}{a}\arctan\frac{x}{a} + C$

[compare with example (8)] and, similarly,

    $\int \frac{dx}{\sqrt{a^2-x^2}} = \arcsin\frac{x}{a} + C$  (a > 0)    (12)

(verify the answer!).
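Other examples are

    $\int \tan^2 x\,dx = \int \frac{\sin^2 x}{\cos^2 x}\,dx = \int \frac{1-\cos^2 x}{\cos^2 x}\,dx =$
    $= \int \left(\frac{1}{\cos^2 x} - 1\right) dx = \int \frac{dx}{\cos^2 x} - \int dx = \tan x - x + C$,

    $\int \frac{dx}{x(x-1)} = \int \frac{x-(x-1)}{x(x-1)}\,dx = \int \left(\frac{1}{x-1} - \frac{1}{x}\right) dx =$
    $= \ln|x-1| - \ln|x| + C = \ln\left|\frac{x-1}{x}\right| + C$,

    $\int \frac{dx}{x^2-a^2} = \int \frac{(x+a)-(x-a)}{2a\,(x-a)(x+a)}\,dx =$
    $= \frac{1}{2a}\int \left(\frac{1}{x-a} - \frac{1}{x+a}\right) dx = \frac{1}{2a}\ln\left|\frac{x-a}{x+a}\right| + C$

In computing the last two integrals we have applied a general
method of representing a given fraction in the form of a sum
of simpler fractions. When using the method we factor the denominator
and then try to represent the numerator as a combination
of factors entering into the denominator. If this is possible we get
a decomposition of the fraction into a sum of several fractions and
perform cancellation in each of the summands.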
Let us take one more useful example. Let it be necessary to find
the integral $\int \sin 5x \cos 3x\,dx$. From trigonometry we know the
formula

    $\sin\alpha \cos\beta = \frac{1}{2}\,[\sin(\alpha+\beta) + \sin(\alpha-\beta)]$

Therefore,

    $\int \sin 5x \cos 3x\,dx = \int \frac{\sin 8x + \sin 2x}{2}\,dx = -\frac{1}{16}\cos 8x - \frac{1}{4}\cos 2x + C$

In similar circumstances we also utilize the formulas

    $\sin\alpha \sin\beta = \frac{1}{2}\,[\cos(\alpha-\beta) - \cos(\alpha+\beta)]$,
    $\cos\alpha \cos\beta = \frac{1}{2}\,[\cos(\alpha+\beta) + \cos(\alpha-\beta)]$,
    $\sin^2\alpha = \frac{1-\cos 2\alpha}{2}$  and  $\cos^2\alpha = \frac{1+\cos 2\alpha}{2}$

For instance,

    $\int \sin^2 3x\,dx = \int \frac{1-\cos 6x}{2}\,dx = \frac{x}{2} - \frac{1}{12}\sin 6x + C$


We now mention one more interesting technique based on applying
complex functions of a real argument (see Sec. VIII.6) for which
all the integration formulas remain valid. Obviously, if we integrate
such a function its real part and imaginary part will also be
integrated, that is if f (x) = u (x) + iv (x) then

    $\int f(x)\,dx = \int (u(x) + iv(x))\,dx = \int u(x)\,dx + \int iv(x)\,dx =$
    $= \int (\operatorname{Re} f(x))\,dx + i\int (\operatorname{Im} f(x))\,dx$

Therefore $\operatorname{Re}\left(\int f(x)\,dx\right) = \int (\operatorname{Re} f(x))\,dx$ and
$\operatorname{Im}\left(\int f(x)\,dx\right) = \int (\operatorname{Im} f(x))\,dx$. For instance, this enables us to find the real
integral

    $\int e^{ax}\cos bx\,dx = \int \operatorname{Re}\left[e^{ax}e^{ibx}\right] dx = \operatorname{Re}\int e^{(a+ib)x}\,dx =$
    $= \operatorname{Re}\frac{e^{(a+ib)x}}{a+ib} + C = \operatorname{Re}\frac{e^{ax}(\cos bx + i\sin bx)(a-ib)}{a^2+b^2} + C =$
    $= e^{ax}\,\frac{a\cos bx + b\sin bx}{a^2+b^2} + C$

by means of Euler's formula (see Sec. VIII.4).
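The passage from the complex exponential to the real answer can be checked with complex arithmetic. This is a minimal sketch added for illustration: the function names and the particular values of a, b and x are assumptions, and the derivative is approximated by a central difference.

```python
import cmath
import math

def closed_form(a, b, x):
    """e^{ax}(a cos bx + b sin bx)/(a^2 + b^2): the real answer from the text with C = 0."""
    return math.exp(a * x) * (a * math.cos(b * x) + b * math.sin(b * x)) / (a * a + b * b)

def via_complex(a, b, x):
    """Real part of e^{(a+ib)x}/(a+ib), the intermediate complex antiderivative."""
    k = complex(a, b)
    return (cmath.exp(k * x) / k).real

a, b = 0.5, 3.0  # illustrative values, not taken from the text

# The two expressions should agree up to rounding.
max_gap = max(abs(closed_form(a, b, x) - via_complex(a, b, x))
              for x in [-1.0, 0.0, 0.4, 2.0])

# The derivative of the closed form should reproduce the integrand e^{ax} cos bx.
h, x0 = 1e-6, 0.7
slope = (closed_form(a, b, x0 + h) - closed_form(a, b, x0 - h)) / (2 * h)
integrand = math.exp(a * x0) * math.cos(b * x0)
print(max_gap, abs(slope - integrand))
```

Taking the imaginary part instead of the real part would give, in the same way, the integral of $e^{ax}\sin bx$.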
4. Integration by Parts. Unfortunately, there is no formula expressing
the integral of a product of functions in terms of the integrals
of the factors. As we know, the derivative of an elementary function
is always an elementary function. But the integral of an elementary
function may not be an elementary function, and this fact is
connected with the above property. (The notion of an elementary
function was introduced in Sec. I.18.) For instance, we have the tabular
integrals $\int \sin x\,dx$ and $\int \frac{dx}{x}$ but the integral $\int \frac{\sin x}{x}\,dx$ is not
expressible in terms of elementary functions. Integrals of this type
will be discussed in Sec. 14.

But if we integrate both sides of formula (uv)' = u'v + uv' (see
Sec. IV.4) we obtain

    $uv = \int u'v\,dx + \int uv'\,dx$

that is

    $\int uv'\,dx = uv - \int u'v\,dx$    (13)

or, which is the same,

    $\int u\,dv = uv - \int v\,du$    (14)

Formula (13) or the equivalent formula (14) is called the formula
of integration by parts. When applying formula (13) we factor
the integrand into two factors, i.e. u and v', and then differentiate
the first factor and integrate the second. Hence, we pass to an
integral in which u' substitutes for u and v for v'. After such a
transformation we may arrive at a tabular integral or at an integral which
is simpler than the original one.


Take some examples. In calculating the integral $\int x^2 \ln x\,dx$
we see that it is advisable to differentiate ln x because this yields
a power function which is simpler than the logarithmic one. Of
course, we must simultaneously integrate the other factor ($x^2$) but
this yields a power function again. Hence, putting u = ln x and
$dv = x^2\,dx$ we find $u' = \frac{1}{x}$, $v = \frac{x^3}{3}$ and thus we have

    $\int x^2 \ln x\,dx = \frac{x^3}{3}\ln x - \int \frac{x^3}{3}\,d(\ln x) = \frac{x^3}{3}\ln x - \int \frac{x^2}{3}\,dx = \frac{x^3}{3}\ln x - \frac{x^3}{9} + C$

It should be noted that while computing v we did not put down the
corresponding arbitrary constant, that is we did not write
$v = \frac{x^3}{3} + C$, because for our aims it was sufficient to obtain a single
function v.

Similarly, we often try to differentiate arc tan x and arc sin x
because this yields simpler functions.


When computing the integral $\int x^2 \sin 3x\,dx$ we should differentiate
the power function since this reduces the exponent by
unity. Therefore, integrating by parts twice in this manner we arrive
at a tabular integral. At the same time, the differentiation or
integration of the sine yields trigonometric functions and thus this
factor is neither simplified nor complicated. Hence, we have

    $\int x^2 \sin 3x\,dx = -\frac{x^2}{3}\cos 3x + \int \frac{1}{3}\cos 3x \cdot 2x\,dx$

(we have used the expressions $u = x^2$ and dv = sin 3x dx here,
i.e. du = 2x dx and $v = -\frac{1}{3}\cos 3x$). Further, denoting u = x
and dv = cos 3x dx, i.e. du = dx and $v = \frac{1}{3}\sin 3x$, we finally
obtain

    $\int x^2 \sin 3x\,dx = -\frac{x^2}{3}\cos 3x + \frac{2}{3}\left(\frac{x}{3}\sin 3x - \int \frac{1}{3}\sin 3x\,dx\right) =$
    $= -\frac{1}{3}x^2\cos 3x + \frac{2}{9}x\sin 3x + \frac{2}{27}\cos 3x + C$

It can happen that after integration by parts we obtain the original
integral on the right-hand side but with another coefficient.
Then, combining similar terms we can compute the integral. For
instance, we compute the integral $\int \sqrt{1-x^2}\,dx$ by introducing
$u = \sqrt{1-x^2}$ and dv = dx (i.e. $du = -\frac{x}{\sqrt{1-x^2}}\,dx$ and v = x):

    $\int \sqrt{1-x^2}\,dx = x\sqrt{1-x^2} + \int \frac{x^2}{\sqrt{1-x^2}}\,dx = x\sqrt{1-x^2} + \int \frac{1-(1-x^2)}{\sqrt{1-x^2}}\,dx =$
    $= x\sqrt{1-x^2} - \int \sqrt{1-x^2}\,dx + \int \frac{dx}{\sqrt{1-x^2}} = x\sqrt{1-x^2} - \int \sqrt{1-x^2}\,dx + \arcsin x$

Now, transposing the integral thus obtained to the left-hand
side we receive

    $2\int \sqrt{1-x^2}\,dx = x\sqrt{1-x^2} + \arcsin x + C$

where C is a constant (because the expression $\int \sqrt{1-x^2}\,dx$, an
indefinite integral, which we have transposed is defined to within
a constant addend). Consequently,

    $\int \sqrt{1-x^2}\,dx = \frac{x}{2}\sqrt{1-x^2} + \frac{1}{2}\arcsin x + C_1$

where $C_1 = \frac{C}{2}$ is an arbitrary constant.
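The two answers obtained by integration by parts in this section can be compared with a direct numerical quadrature. The sketch below is an added illustration, not the author's method: it uses the composite Simpson rule, and the helper names, the intervals and the number of subintervals are assumptions.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def F_parts(x):
    """Antiderivative of x^2 sin 3x found by integrating by parts twice (C = 0)."""
    return (-x**2 * math.cos(3 * x) / 3
            + 2 * x * math.sin(3 * x) / 9
            + 2 * math.cos(3 * x) / 27)

def F_root(x):
    """Antiderivative of sqrt(1 - x^2) found by returning to the original integral (C = 0)."""
    return x * math.sqrt(1 - x * x) / 2 + math.asin(x) / 2

# Difference of antiderivative values versus numerical area under the integrand.
err_parts = abs(simpson(lambda x: x**2 * math.sin(3 * x), 0.0, 2.0)
                - (F_parts(2.0) - F_parts(0.0)))
err_root = abs(simpson(lambda x: math.sqrt(1 - x * x), -0.5, 0.5)
               - (F_root(0.5) - F_root(-0.5)))
print(err_parts, err_root)
```

The interval for the second integral is kept away from x = ±1, where the integrand has an unbounded derivative and Simpson's rule converges slowly.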


5. Integration by Change of Variable (by Substitution). Here we are
going to describe one of the most widely used methods of integral
calculus based on the formula of differentiation of a composite
function (Sec. IV.4). Suppose a function F (x) is an antiderivative
of f (x), and let x depend in a certain manner on t, i.e. $x = \varphi(t)$.
Compute the derivative of F (x) with respect to t:

    $[F(x)]'_t = [F(x)]'_x\,x'_t = f(x)\,\varphi'(t) = f[\varphi(t)]\,\varphi'(t)$

If we integrate both sides with respect to t we get

    $F(x) + C = \int f[\varphi(t)]\,\varphi'(t)\,dt$

Therefore, by formula (1), we have

    $\left[\int f(x)\,dx\right]_{x=\varphi(t)} = \int f[\varphi(t)]\,\varphi'(t)\,dt$    (15)

It is this formula that is the basic formula of integration by
substitution (by change of variable).
We have $\varphi'(t)\,dt = dx$, and the right-hand side of formula (15)
can therefore be rewritten as $\int f(x)\,dx$. But in the process of
integration we do not regard x as an independent variable but
consider it to be dependent on t.

Consequently, formula (15) can be interpreted as follows: any
integration formula of the form

    $\int f(x)\,dx = F(x) + C$    (16)

remains valid if we make an arbitrary substitution $x = \varphi(t)$ both
in the right-hand side and in the element of integration. Any
formula of the form (16) is invariant in this sense.

For instance, substituting $x = u^3$ into formula (3) we obtain

    $\int \cos u^3\,d(u^3) = \sin u^3 + C$, that is $\int u^2\cos u^3\,du = \frac{1}{3}\sin u^3 + C$

and the like. Of course, when we apply formula (15) to a practical
problem we do not start from a tabular formula; on the contrary,
we try to find a change of variable which reduces a given integral
to a certain tabular integral.
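Let us consider some examples. To get rid of the radical in the
integral $\int \frac{\sqrt{x}}{1+x}\,dx$ we perform the change of variable $x = t^2$,
$dx = 2t\,dt$:

    $\int \frac{\sqrt{x}}{1+x}\,dx = \int \frac{t \cdot 2t\,dt}{1+t^2} = 2\int \frac{t^2+1-1}{t^2+1}\,dt =$
    $= 2\left(\int dt - \int \frac{dt}{1+t^2}\right) = 2\,(t - \arctan t) + C = 2\left(\sqrt{x} - \arctan\sqrt{x}\right) + C$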




Hence, after performing the substitution and integration we must
make the reverse substitution, that is we must pass from t to x.

We sometimes regard the right-hand side of formula (15) as given
and apply the formula for computing the left-hand side, that is
we make the substitution $\varphi(x) = u$ instead of $x = \varphi(t)$. For
example, to find the integral $\int x e^{x^2}\,dx$ we take advantage of the
fact that the element of integration can be simply expressed in
terms of $x^2$ because $x\,dx = \frac{1}{2}\,d(x^2)$. Therefore, putting $x^2 = u$,
2x dx = du we obtain

    $\int x e^{x^2}\,dx = \frac{1}{2}\int e^u\,du = \frac{1}{2}e^u + C = \frac{1}{2}e^{x^2} + C$

Substitutions of the form $\psi(x) = \varphi(t)$ are also applied in some
problems.
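The two substitution examples above can be cross-checked numerically. What follows is a sketch under stated assumptions rather than part of the original text: the Simpson helper, the intervals [1, 4] and [0, 1], and the variable names are illustrative choices.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def F_radical(x):
    """2(sqrt(x) - arctan sqrt(x)): result of the change of variable x = t^2 (C = 0)."""
    return 2 * (math.sqrt(x) - math.atan(math.sqrt(x)))

def F_exp(x):
    """e^{x^2}/2: result of the substitution x^2 = u (C = 0)."""
    return math.exp(x * x) / 2

err_radical = abs(simpson(lambda x: math.sqrt(x) / (1 + x), 1.0, 4.0)
                  - (F_radical(4.0) - F_radical(1.0)))
err_exp = abs(simpson(lambda x: x * math.exp(x * x), 0.0, 1.0)
              - (F_exp(1.0) - F_exp(0.0)))
print(err_radical, err_exp)
```

Note that each antiderivative is written in terms of x, that is after the reverse substitution has been made.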

We could compute integral (7) by means of the substitution 3x = t,
3 dx = dt:

    $\int \cos 3x\,dx = \int \cos t\,\frac{dt}{3} = \frac{1}{3}\int \cos t\,dt = \frac{1}{3}\sin t + C = \frac{1}{3}\sin 3x + C$

We can sometimes perform calculations of this type without putting
down the change of variable explicitly; for instance,

    $\int \cos 3x\,dx = \int \cos 3x\,\frac{d(3x)}{3} = \frac{1}{3}\int \cos 3x\,d(3x) = \frac{1}{3}\sin 3x + C$

Here we have used the invariance of formula (3).
Similarly,

    $\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx = -\int \frac{d(\cos x)}{\cos x} = -\ln|\cos x| + C$

In general, we have

    $\int \frac{f'(x)}{f(x)}\,dx = \int \frac{df(x)}{f(x)} = \ln|f(x)| + C$    (17)

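Further, we have

    $\int \frac{x^3\,dx}{\sqrt{x^4+1}} = \frac{1}{4}\int (x^4+1)^{-\frac{1}{2}}\,d(x^4+1) = \frac{1}{4}\,\frac{(x^4+1)^{\frac{1}{2}}}{\frac{1}{2}} + C = \frac{\sqrt{x^4+1}}{2} + C$

Such an integration formula can be put down in the general
form as

    $\int \frac{f'(x)}{\sqrt{f(x)}}\,dx = \int [f(x)]^{-\frac{1}{2}}\,df(x) = \frac{[f(x)]^{\frac{1}{2}}}{\frac{1}{2}} + C = 2\sqrt{f(x)} + C$    (18)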


In particular, by means of formulas (17) and (18) and by the
method of completing a square, we can compute the integrals of
the form

    $\int \frac{\alpha x + \beta}{\sqrt{ax^2+bx+c}}\,dx$  and  $\int \frac{\alpha x + \beta}{ax^2+bx+c}\,dx$

which are widely encountered.


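For example, let us illustrate the computation of the integral
$\int \frac{2x-3}{\sqrt{-3x^2+2x+1}}\,dx$. To do this we take into account that the
derivative of the radicand is equal to $-6x + 2 = -6\left(x - \frac{1}{3}\right)$:

    $\int \frac{2x-3}{\sqrt{-3x^2+2x+1}}\,dx = \int \frac{2\left[\left(x-\frac{1}{3}\right)+\frac{1}{3}\right]-3}{\sqrt{-3x^2+2x+1}}\,dx = \int \frac{2\left(x-\frac{1}{3}\right)-\frac{7}{3}}{\sqrt{-3x^2+2x+1}}\,dx =$
    $= -\frac{1}{3}\int \frac{d(-3x^2+2x+1)}{\sqrt{-3x^2+2x+1}} - \frac{7}{3}\int \frac{dx}{\sqrt{\frac{4}{3}-3\left(x-\frac{1}{3}\right)^2}} =$
    $= -\frac{2}{3}\sqrt{-3x^2+2x+1} - \frac{7}{3\sqrt{3}}\int \frac{d\left(x-\frac{1}{3}\right)}{\sqrt{\frac{4}{9}-\left(x-\frac{1}{3}\right)^2}} =$
    $= -\frac{2}{3}\sqrt{-3x^2+2x+1} - \frac{7}{3\sqrt{3}}\arcsin\frac{3x-1}{2} + C$    (19)

[see formula (12)].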
The problem of integration is much more complicated than the
problem of differentiation. The reader should practise a great deal in
order to learn elementary methods of integration.


§ 2. Standard Methods of Integration 


Here we shall present some classes of functions which can be
integrated by means of certain standard methods. It should be
noted that in some cases these standard methods may not be the
simplest. It is often advisable to perform certain preliminary
transformations or to apply directly methods of § 1 in order to simplify
calculations. But the reader will be able to find the simplest
technique leading to the desired result only after the necessary experience
in computing integrals has been acquired.

6. Integration of Rational Functions. Rational functions are 
integrated on the basis of the results of Sec. VIII.10. As was shown, 
any rational function (rational fraction) can be represented in the 
form of a sum of an entire rational function (a polynomial), in case 
the fraction in question is improper, and partial rational fractions. 

A polynomial can be integrated termwise by means of the sim- 
plest methods [for instance, see example (11)]. Partial fractions 


of the form $\frac{A}{(x-a)^k}$ can also be integrated quite easily.
For instance, if it is necessary to integrate function (VIII.38) 
then, by (VIII.39) and (VIII.42), we obtain 


ee ns gs ee ee ee 
\ z(e—1) eae ot = f [-zst+tE=9 5 ert 
5a 2 4 3 2 oe! 
+2 in|2+2|-+ const 
Hence, now we must consider partial fractions of the form

    $\frac{Mx+N}{(x^2+px+q)^\beta}$,  $p^2 - 4q < 0$    (20)

We begin the integration with a simplification of the numerator.
Namely, taking into account that $(x^2+px+q)' = 2\left(x+\frac{p}{2}\right)$
we replace x in the numerator by $\left(x+\frac{p}{2}\right) - \frac{p}{2}$ and then combine
similar terms without removing the parentheses. After that we
break up the integral into two integrals [as in calculating
expression (19)]. The first integral is of the form

    $\int \frac{\left(x+\frac{p}{2}\right)dx}{(x^2+px+q)^\beta} = \frac{1}{2}\int \frac{d(x^2+px+q)}{(x^2+px+q)^\beta}$

and we therefore find it immediately. The second integral is of the
form $\int \frac{dx}{(x^2+px+q)^\beta}$. To compute it we complete the square
in the denominator which results in $x^2+px+q = (x+a)^2 + b$
where a and b are constants. Now, if we put x + a = y we arrive
at the integral

    $I_\beta = \int \frac{dy}{(y^2+b)^\beta}$    (21)
which can be easily found in the case $\beta = 1$ (how can we do it?).
To find integral (21) for $\beta = 2, 3, \ldots$ we shall deduce a recurrence
formula which will enable us to pass from $I_\beta$ to the simpler integral
$I_{\beta-1}$, etc. The formula is obtained by means of integration by parts.
We have

    $I_\beta = \frac{1}{b}\int \frac{(b+y^2)-y^2}{(y^2+b)^\beta}\,dy = \frac{1}{b}\,I_{\beta-1} - \frac{1}{b}\int y\,\frac{y\,dy}{(y^2+b)^\beta}$

Here we put u = y, $dv = \frac{y\,dy}{(y^2+b)^\beta}$, i.e. du = dy,

    $v = \frac{1}{2}\int \frac{d(y^2+b)}{(y^2+b)^\beta} = -\frac{1}{2(\beta-1)}\,\frac{1}{(y^2+b)^{\beta-1}}$

Hence,

    $I_\beta = \frac{1}{b}\,I_{\beta-1} + \frac{1}{b}\,\frac{y}{2(\beta-1)(y^2+b)^{\beta-1}} - \frac{1}{b}\,\frac{1}{2(\beta-1)}\,I_{\beta-1} =$
    $= \frac{y}{2b(\beta-1)(y^2+b)^{\beta-1}} + \frac{2\beta-3}{2b(\beta-1)}\,I_{\beta-1}$    (22)

(let the reader verify all the calculations!).

As we have already mentioned, formulas of this type are called
recurrence formulas. Such formulas express an unknown quantity
dependent on a number (this is the quantity $I_\beta$ with the number $\beta$
in our case) in terms of similar quantities with lower numbers (this
is the quantity $I_{\beta-1}$ with the number $\beta - 1$ in our case). These
formulas may not yield the solution immediately but they enable
us to obtain the solution after several successive reductions of the
number. Thus, formula (22) expresses $I_\beta$ in terms of $I_{\beta-1}$. If we
repeatedly apply the formula to $I_{\beta-1}$, that is if we substitute $\beta - 1$
for $\beta$ into formula (22), we obtain the expression of $I_{\beta-1}$ in terms of
$I_{\beta-2}$ etc. Finally, we arrive at the integral $I_1$ which is immediately
found, as has already been indicated.
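The successive reductions described above translate directly into a recursive computation. The sketch below is an illustration under stated assumptions: it treats only b > 0, takes $I_1 = \frac{1}{\sqrt{b}}\arctan\frac{y}{\sqrt{b}}$ as the starting integral, omits the arbitrary constant, and uses hypothetical names (`I_beta`, `simpson`) and an arbitrary test interval.

```python
import math

def I_beta(beta, y, b):
    """Antiderivative of 1/(y^2 + b)^beta for b > 0 via recurrence (22), constant omitted."""
    if beta == 1:
        root = math.sqrt(b)
        return math.atan(y / root) / root
    lower = I_beta(beta - 1, y, b)  # the simpler integral with the number reduced by one
    return (y / (2 * b * (beta - 1) * (y * y + b) ** (beta - 1))
            + (2 * beta - 3) / (2 * b * (beta - 1)) * lower)

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

# Compare the recurrence with direct quadrature for beta = 3, b = 2 on [0, 1].
beta, b = 3, 2.0
numeric = simpson(lambda y: 1.0 / (y * y + b) ** beta, 0.0, 1.0)
from_recurrence = I_beta(beta, 1.0, b) - I_beta(beta, 0.0, b)
print(abs(numeric - from_recurrence))
```

Each call lowers the number by one, so the depth of the recursion equals $\beta - 1$.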

It is worth noting that in the above calculations we did not use
the fact that the trinomial in the denominator of expression (20)
has imaginary roots. The procedure can therefore be applied to
integrating a fraction of form (20) when the denominator has real
roots, without decomposing the fraction into two summands of the
form $\frac{A}{(x-a)^k}$.

Integral (21) for b > 0 can also be computed with the help of the
substitution $y = \sqrt{b}\,\tan t$. This leads to an integral of a power
of cos t which will be discussed later.
Hence, the integral of a rational fraction is always expressible
in terms of elementary functions, and this can be achieved by means
of the above standard methods. The elementary functions in terms
of which an integral of this type is expressed are rational functions,
the logarithmic function and the arc tangent. The most difficult
thing in the integration is the factorization of the denominator in
accordance with formula (VIII.29).

Methods of computing many integrals of other types which we
are going to study here are essentially based on the transition from
a given integral to an integral of a rational function by means of
suitable substitutions. This is the so-called rationalization of the
integral which reduces the computation to the above standard
methods.

7. Integration of Irrational Functions Involving Linear and
Linear-Fractional Expressions. First we take an integral of the form

    $\int R\left(x, \sqrt[n]{ax+b}\right) dx$  (n = 2, 3, ...)    (23)

where a and b are constants and R (x, y) is a rational function of
its two arguments x and y (see Sec. I.17). The integrand is an
irrational function here because it contains the radical. To rationalize
the integral let us use the substitution

    $ax + b = t^n$,  $a\,dx = n\,t^{n-1}\,dt$

which yields

    $\int R\left(x, \sqrt[n]{ax+b}\right) dx = \int R\left(\frac{t^n-b}{a},\, t\right) \frac{n}{a}\,t^{n-1}\,dt$

The integrand in the last integral is a rational function (why?).
Similarly, an integral of the form

    $\int R\left(x, \sqrt[n]{ax+b}, \sqrt[m]{ax+b}, \ldots\right) dx$  (n, m = 2, 3, 4, ...)    (24)

where R (x, y, z, ...) is a rational function of its arguments
x, y, z, ... goes into an integral of a rational function after the
substitution $ax + b = t^p$ with p suitably chosen (how must we
choose p in the general case?).

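For example, the substitution $2x + 3 = t^6$, $2\,dx = 6t^5\,dt$ yields

    $\int \frac{dx}{\sqrt{2x+3} - 2\sqrt[3]{2x+3}} = \int \frac{3t^5\,dt}{t^3 - 2t^2} = 3\int \frac{t^3\,dt}{t-2}$

Performing the division of $t^3$ by t - 2 we find

    $\frac{t^3}{t-2} = t^2 + 2t + 4 + \frac{8}{t-2}$

and hence we finally obtain

    $\int \frac{dx}{\sqrt{2x+3} - 2\sqrt[3]{2x+3}} = 3\int \left(t^2 + 2t + 4 + \frac{8}{t-2}\right) dt =$
    $= t^3 + 3t^2 + 12t + 24\ln|t-2| + C = \sqrt{2x+3} + 3\sqrt[3]{2x+3} +$
    $+ 12\sqrt[6]{2x+3} + 24\ln\left|\sqrt[6]{2x+3} - 2\right| + C$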
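The answer of the rationalization example can be verified by differentiation, as recommended in Sec. 1. The sketch below is an added illustration: it uses a central difference instead of exact differentiation, and the sample points (chosen on both sides of the value x = 30.5 where $\sqrt[6]{2x+3} = 2$) and the function names are assumptions.

```python
import math

def F(x):
    """t^3 + 3t^2 + 12t + 24 ln|t - 2| with t = (2x + 3)^(1/6), constant omitted."""
    t = (2 * x + 3) ** (1 / 6)
    return t**3 + 3 * t**2 + 12 * t + 24 * math.log(abs(t - 2))

def f(x):
    """The integrand 1/(sqrt(2x+3) - 2 cbrt(2x+3))."""
    s = 2 * x + 3
    return 1.0 / (math.sqrt(s) - 2 * s ** (1 / 3))

def slope(x, h=1e-6):
    """Central-difference approximation of dF/dx."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Points with t < 2 (x = 0, 1, 3) and with t > 2 (x = 50).
worst = max(abs(slope(x) - f(x)) for x in [0.0, 1.0, 3.0, 50.0])
print(worst)
```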
The rationalization of an integral of the form

    $\int R\left(x, \sqrt[n]{\frac{ax+b}{cx+d}}\right) dx$  (n = 2, 3, ...)    (25)

where R (x, y) is a rational function is carried out by means of the
substitution

    $\frac{ax+b}{cx+d} = t^n$,  $ax + b = cx\,t^n + d\,t^n$,  $x = \frac{d\,t^n - b}{a - c\,t^n}$

Thus, integrals (23)-(25) in which R is a rational function of its
arguments are always expressible in terms of elementary functions.

8. Integration of Irrational Expressions Containing Quadratic
Trinomials. Here we mean integrals of the form

    $\int R\left(x, \sqrt{ax^2+bx+c}\right) dx$    (26)

where R (x, y) is a rational function of its arguments. Such an
integral can also be expressed in terms of elementary functions in
all cases. In computing these integrals we apply trigonometric
substitutions. In order to do this we first complete the square and pass
to a new integral:

    $\int R\left(x, \sqrt{ax^2+bx+c}\right) dx = \int R\left(x, \sqrt{\pm(kx+l)^2 \pm m^2}\right) dx$

where k, l and m are constants. After that we use one of the
following substitutions:

    $kx + l = m\tan t$ for the radical $\sqrt{(kx+l)^2 + m^2}$,
    $kx + l = m\sin t$ for the radical $\sqrt{m^2 - (kx+l)^2}$ and
    $kx + l = \frac{m}{\cos t}$ for the radical $\sqrt{(kx+l)^2 - m^2}$

(of course, we cannot have the case $\sqrt{-(kx+l)^2 - m^2}$ for real
integrals). The substitutions enable us to extract the roots (check
it up!) and thus we come to an integral of the form

    $\int R_1(\cos t, \sin t)\,dt$    (27)

where $R_1(x, y)$ is another rational function of its arguments. In
Sec. 10 we shall describe methods of computing an integral of form
(27).

We sometimes use the hyperbolic substitutions

    $kx + l = m\sinh t$,  $kx + l = m\tanh t$  and  $kx + l = m\cosh t$
There are certain direct methods that can be applied to computing
integrals of form (26). For instance, we can often pass to an integral
of the form

    $\int \frac{P_n(x)}{\sqrt{ax^2+bx+c}}\,dx$    (28)

where $P_n(x)$ is a polynomial of the nth degree. Such an integral
can be easily found with the help of the method of undetermined
coefficients. Let us show that the integral can be represented in
the form

    $Q_{n-1}(x)\sqrt{ax^2+bx+c} + K\int \frac{dx}{\sqrt{ax^2+bx+c}}$    (29)

where $Q_{n-1}(x)$ is a polynomial of the (n - 1)th degree and K is
a constant. The last integral is readily found (see the end of Sec. 5).
For definiteness, let n = 3. Equating (28) to (29) we obtain

    $\int \frac{\alpha x^3 + \beta x^2 + \gamma x + \delta}{\sqrt{ax^2+bx+c}}\,dx = (Ax^2 + Bx + C)\sqrt{ax^2+bx+c} + K\int \frac{dx}{\sqrt{ax^2+bx+c}}$    (30)

where all the coefficients on the left-hand side are given and the
coefficients A, B, C and K should be determined. In order to find
them let us differentiate equality (30):

    $\frac{\alpha x^3 + \beta x^2 + \gamma x + \delta}{\sqrt{ax^2+bx+c}} = (2Ax + B)\sqrt{ax^2+bx+c} + (Ax^2 + Bx + C)\,\frac{2ax+b}{2\sqrt{ax^2+bx+c}} + \frac{K}{\sqrt{ax^2+bx+c}}$,

    $\alpha x^3 + \beta x^2 + \gamma x + \delta = (2Ax + B)(ax^2 + bx + c) + \frac{1}{2}(Ax^2 + Bx + C)(2ax + b) + K$

Equating coefficients in equal powers of x, that is coefficients in
$x^3$, $x^2$, $x^1 = x$, $x^0 = 1$, we find, in succession:

    $3aA = \alpha$
    $\frac{5}{2}bA + 2aB = \beta$
    $2cA + \frac{3}{2}bB + aC = \gamma$
    $cB + \frac{1}{2}bC + K = \delta$

We have $a \neq 0$ and therefore A is easily found from the first
equation. Substituting the value of A thus found into the second
equation we determine B etc. Consequently, we thus can determine
all the coefficients A, B, C and K which justifies formula (30).
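Since the four equations form a triangular system, they can be solved one after another exactly as described. The following is a minimal sketch, assuming exact rational arithmetic with Python's `fractions` module; the function names and the test data are hypothetical and serve only as an illustration for the case n = 3.

```python
from fractions import Fraction

def solve_coefficients(alpha, beta, gamma, delta, a, b, c):
    """Solve the triangular system for A, B, C, K in formula (30); requires a != 0."""
    alpha, beta, gamma, delta, a, b, c = (
        Fraction(v) for v in (alpha, beta, gamma, delta, a, b, c))
    if a == 0:
        raise ValueError("a must be nonzero")
    A = alpha / (3 * a)                                        # from 3aA = alpha
    B = (beta - Fraction(5, 2) * b * A) / (2 * a)              # from (5/2)bA + 2aB = beta
    C = (gamma - 2 * c * A - Fraction(3, 2) * b * B) / a       # from 2cA + (3/2)bB + aC = gamma
    K = delta - c * B - b * C / 2                              # from cB + (1/2)bC + K = delta
    return A, B, C, K

def identity_holds(alpha, beta, gamma, delta, a, b, c):
    """Check the polynomial identity obtained after differentiating (30) at several points."""
    A, B, C, K = solve_coefficients(alpha, beta, gamma, delta, a, b, c)
    for x in map(Fraction, range(-3, 4)):
        left = alpha * x**3 + beta * x**2 + gamma * x + delta
        right = ((2 * A * x + B) * (a * x * x + b * x + c)
                 + (A * x * x + B * x + C) * (2 * a * x + b) / 2 + K)
        if left != right:
            return False
    return True

# Illustrative case: numerator x^3 and radical sqrt(x^2 + 1).
example = solve_coefficients(1, 0, 0, 0, 1, 0, 1)
print(example, identity_holds(2, -1, 3, 5, 2, -2, 1))
```

For the illustrative case the coefficients are A = 1/3, B = 0, C = -2/3 and K = 0, that is the integral of $\frac{x^3}{\sqrt{x^2+1}}$ equals $\frac{x^2-2}{3}\sqrt{x^2+1} + \text{const}$.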

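For instance, let it be necessary to compute the integral

    $\int \sqrt{2x^2-2x+1}\,dx$

In order to do this we write

    $\int \sqrt{2x^2-2x+1}\,dx = \int \frac{2x^2-2x+1}{\sqrt{2x^2-2x+1}}\,dx = (Ax + B)\sqrt{2x^2-2x+1} + K\int \frac{dx}{\sqrt{2x^2-2x+1}}$,

    $\frac{2x^2-2x+1}{\sqrt{2x^2-2x+1}} = A\sqrt{2x^2-2x+1} + (Ax + B)\,\frac{2x-1}{\sqrt{2x^2-2x+1}} + \frac{K}{\sqrt{2x^2-2x+1}}$

Hence, we have

    $2x^2 - 2x + 1 = A(2x^2 - 2x + 1) + (Ax + B)(2x - 1) + K$

and therefore

    $4A = 2$
    $-3A + 2B = -2$    which yields $A = \frac{1}{2}$, $B = -\frac{1}{4}$, $K = \frac{1}{4}$
    $A - B + K = 1$

Thus, we obtain

    $\int \frac{dx}{\sqrt{2x^2-2x+1}} = \frac{1}{\sqrt{2}}\int \frac{d\left(x-\frac{1}{2}\right)}{\sqrt{\left(x-\frac{1}{2}\right)^2+\frac{1}{4}}} = \frac{1}{\sqrt{2}}\ln\left|x - \frac{1}{2} + \sqrt{\left(x-\frac{1}{2}\right)^2+\frac{1}{4}}\right| + C =$
    $= \frac{1}{\sqrt{2}}\ln\left|\frac{2x-1}{2} + \frac{\sqrt{2x^2-2x+1}}{\sqrt{2}}\right| + C = \frac{1}{\sqrt{2}}\ln\left|2x - 1 + \sqrt{4x^2-4x+2}\right| + C_1$,

where $C_1 = -\frac{1}{\sqrt{2}}\ln 2 + C$.

Finally,

    $\int \sqrt{2x^2-2x+1}\,dx = \left(\frac{x}{2} - \frac{1}{4}\right)\sqrt{2x^2-2x+1} + \frac{1}{4\sqrt{2}}\ln\left|2x - 1 + \sqrt{4x^2-4x+2}\right| + C_1$

Of course, we can simply write C instead of $C_1$ in the final answer.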
The integral

    $\int \frac{dx}{(x-\alpha)^n\sqrt{ax^2+bx+c}}$  (n = 1, 2, ...)    (31)

can be reduced to integral (28) by means of the substitution
$x - \alpha = \frac{1}{t}$. Hence, after the substitution has been carried out we can
apply the method of undetermined coefficients. The method can
also be applied to an integral of the form

    $\int \frac{P(x)}{Q(x)\sqrt{ax^2+bx+c}}\,dx$    (32)

where P (x) and Q (x) are polynomials. Indeed, if we decompose
the fraction $\frac{P(x)}{Q(x)}$ into an entire part and a sum of partial rational
fractions of type $\frac{A}{(x-\alpha)^n}$ [see formula (VIII.37)] then integral (32)
breaks into a sum of integrals of forms (28) and (31).
9. Integrals of Binomial Differentials. A binomial differential
is an expression of the form

    $(ax^n + b)^p\,x^m\,dx$

which enters into an integral of the form

    $I = \int (ax^n + b)^p\,x^m\,dx$

that we are going to consider here. The numbers n, p and m in the
expression are rational numbers, that is they are integers or rational
fractional numbers. In 1730 Christian Goldbach (1690-1764), a
Russian mathematician, indicated three cases when the integral of
a binomial differential can be expressed in terms of elementary
functions:

1. The number p is an integer. If p > 0 we should simply remove
the brackets and perform termwise integration. If p < 0 we must
make the substitution $x = t^k$ and choose k in such a way that all
the exponents become integers. This being always possible,
we thus get an integral of a rational function.
2. The number $\frac{m+1}{n}$ is an integer. Substitute $ax^n + b = u$.
Then we get

    $I = \int u^p \left[\left(\frac{u-b}{a}\right)^{\frac{1}{n}}\right]^m \frac{1}{na}\left(\frac{u-b}{a}\right)^{\frac{1}{n}-1} du = \frac{1}{na}\int u^p \left(\frac{u-b}{a}\right)^{\frac{m+1}{n}-1} du$

The exponent $\frac{m+1}{n} - 1$ being an integer, we thus arrive at
preceding case 1.

3. The number $\frac{m+1}{n} + p$ is an integer. Here we use the
substitution $ax^n + b = ux^n$ which again yields case 1. We leave the
calculations to the reader.

It was only in 1853 that the prominent Russian mathematician
P. L. Chebyshev (1821-1894) proved that the integral of a binomial
differential cannot be expressed in terms of elementary functions
(see Sec. 11) except for the three cases enumerated above.

10. Integration of Functions Rationally Involving Trigonometric
Functions. Here we shall deal with an integral of the form

    $\int R(\sin x, \cos x)\,dx$    (33)

where R (u, v) is a rational function in u and v. Such an integral
is always expressible in terms of elementary functions. To prove
this let us make the so-called universal substitution $\tan\frac{x}{2} = t$.
Then we have

    $\sin x = \frac{2t}{1+t^2}$,  $\cos x = \frac{1-t^2}{1+t^2}$,  $x = 2\arctan t$,  $dx = \frac{2\,dt}{1+t^2}$    (34)

(verify the calculations!). Hence, integral (33) reduces to the
integral

    $\int R\left(\frac{2t}{1+t^2}, \frac{1-t^2}{1+t^2}\right) \frac{2\,dt}{1+t^2}$

where the integrand is a rational function of t. The last integral can
be found by means of the method of Sec. 6.
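The universal substitution can be illustrated on a concrete integral. The example below is not taken from the text: it is an added sketch, assuming the integrand $\frac{1}{2+\cos x}$, for which the substitution $\tan\frac{x}{2} = t$ gives $\int \frac{2\,dt}{3+t^2} = \frac{2}{\sqrt{3}}\arctan\frac{t}{\sqrt{3}} + C$ on $-\pi < x < \pi$. The helper names and the interval are hypothetical.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def half_angle_forms(x):
    """Return (2t/(1+t^2), (1-t^2)/(1+t^2)) for t = tan(x/2)."""
    t = math.tan(x / 2)
    return 2 * t / (1 + t * t), (1 - t * t) / (1 + t * t)

# The rational expressions in t should reproduce sin x and cos x.
gap = 0.0
for x in [-2.5, -1.0, 0.3, 2.0]:
    s, c = half_angle_forms(x)
    gap = max(gap, abs(s - math.sin(x)), abs(c - math.cos(x)))

def F(x):
    """(2/sqrt 3) arctan(tan(x/2)/sqrt 3): antiderivative of 1/(2 + cos x) on (-pi, pi)."""
    return 2 / math.sqrt(3) * math.atan(math.tan(x / 2) / math.sqrt(3))

err = abs(simpson(lambda x: 1.0 / (2 + math.cos(x)), 0.0, 2.0) - (F(2.0) - F(0.0)))
print(gap, err)
```

The antiderivative written in this form is valid only between consecutive points where $\tan\frac{x}{2}$ is undefined, which is why the test interval stays inside $(-\pi, \pi)$.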
The universal substitution (34) often leads to very complicated
expressions containing rational fractions and it is therefore
preferable to avoid it in problem-solving practice. In certain particular
cases it is better to use some other substitutions which we are going
to consider here.
1. Let the integrand in integral (33) be an odd function with
respect to sin x, that is let R (-sin x, cos x) = -R (sin x, cos x).
Then we can write

    $I = \int R(\sin x, \cos x)\,dx = \int \frac{R(\sin x, \cos x)}{\sin x}\,\sin x\,dx = \int R_1(\sin x, \cos x)\,\sin x\,dx$

where $R_1$ is an even function with respect to sin x. $R_1$ being a
rational function, we can easily express it in terms of $\sin^2 x$ and cos x.
It follows that

    $I = \int R_2(\sin^2 x, \cos x)\,\sin x\,dx = -\int R_2(1 - \cos^2 x, \cos x)\,d\cos x$

and therefore if we put cos x = t we arrive at an integral of a
rational function.

2. Similarly, if the integrand in (33) is an odd function with
respect to cos x then the substitution sin x = t rationalizes the
integral.

For example,

    $\int \frac{\sin^2 x\,dx}{\cos x\,(\cos^2 x - 2\sin x)} = \int \frac{\sin^2 x\,\cos x\,dx}{\cos^2 x\,(\cos^2 x - 2\sin x)}$

Putting sin x = t, cos x dx = dt we derive

    $\int \frac{\sin^2 x\,dx}{\cos x\,(\cos^2 x - 2\sin x)} = \int \frac{t^2\,dt}{(1-t^2)(1-t^2-2t)}$

The last integral is readily found if we decompose the integrand
into partial fractions or if we take advantage of the equality

    $t^2 = \frac{t}{2}\left[(1 - t^2) - (1 - t^2 - 2t)\right]$

3. If the integrand does not change its value when we simultaneously
change the signs of $\sin x$ and $\cos x$, that is if

$$R(-\sin x, -\cos x) = R(\sin x, \cos x)$$

then we can apply the substitution $\tan x = t$ (or $\cot x = t$). We
can easily verify that this yields the rationalization of the integral
in the general case but we are not going to do this here because
in every concrete example the advisability of the substitution is
confirmed by the results of the calculations.
Take an example:

$$\int \frac{dx}{\sin^2 x \cos^4 x} = \int \frac{dx}{\tan^2 x \cos^6 x}$$

Putting $\tan x = t$, $\frac{dx}{\cos^2 x} = dt$, $\frac{1}{\cos^2 x} = 1 + t^2$, we complete the integration:

$$\int \frac{dx}{\sin^2 x \cos^4 x} = \int \frac{(1 + t^2)^2}{t^2}\,dt = \int (t^2 + 2 + t^{-2})\,dt = \frac{t^3}{3} + 2t - \frac{1}{t} + C = \frac{\tan^3 x}{3} + 2\tan x - \cot x + C$$

The same integral can be computed if we represent it as

$$\int \frac{(\sin^2 x + \cos^2 x)^2}{\sin^2 x \cos^4 x}\,dx$$

and remove the brackets in the numerator (check it up!).
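The result of this example can be checked numerically. The following Python sketch is only an illustration: the helper names `integrand`, `antiderivative` and `central_difference` are ours, and the symmetric difference quotient is an assumed way of approximating a derivative. It compares the derivative of $\frac{\tan^3 x}{3} + 2\tan x - \cot x$ with the integrand $\frac{1}{\sin^2 x \cos^4 x}$ at a few sample points.

```python
import math


def integrand(x):
    """1 / (sin^2 x * cos^4 x), the function integrated in the example."""
    return 1.0 / (math.sin(x) ** 2 * math.cos(x) ** 4)


def antiderivative(x):
    """tan^3 x / 3 + 2 tan x - cot x, obtained with the substitution tan x = t."""
    t = math.tan(x)
    return t ** 3 / 3.0 + 2.0 * t - 1.0 / t


def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


if __name__ == "__main__":
    for x in (0.3, 0.7, 1.1):
        # The two printed numbers should agree to several digits.
        print(x, central_difference(antiderivative, x), integrand(x))
```

Agreement at sample points does not prove the formula, but it catches sign and coefficient slips quickly.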
Let us separately consider integrals of the form

$$\int \sin^m x \cos^n x\,dx \tag{35}$$

where $m$ and $n$ are arbitrary integers of any sign. In case $m$ is odd
the integral belongs to case 1 considered above, and thus it can be
found by means of the substitution $\cos x = t$. If $n$ is odd the integral
belongs to case 2. Finally, if both $m$ and $n$ are even we have
case 3. But the calculations can sometimes be simplified. For instance,
if $m > 0$ and $n > 0$ and if both $m$ and $n$ are even we can
apply the formulas

$$\sin^2 x = \frac{1 - \cos 2x}{2}, \quad \sin x \cos x = \frac{1}{2}\sin 2x \quad \text{and} \quad \cos^2 x = \frac{1 + \cos 2x}{2}$$


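For example,

$$\int \sin^2 x \cos^6 x\,dx = \int (\sin x \cos x)^2 \cos^4 x\,dx = \frac{1}{16}\int \sin^2 2x\,(1 + \cos 2x)^2\,dx =$$

$$= \frac{1}{16}\left(\frac{x}{2} - \frac{\sin 4x}{8}\right) + \frac{1}{16}\cdot\frac{\sin^3 2x}{3} + \frac{1}{128}\int (1 - \cos 8x)\,dx =$$

$$= \frac{5}{128}x - \frac{1}{128}\sin 4x + \frac{1}{48}\sin^3 2x - \frac{1}{1024}\sin 8x + C$$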
The same result can be obtained if we express the trigonometric 
functions in terms of exponential functions by using Euler’s for- 
mulas (VIII.411). 
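A long chain of trigonometric reductions of this kind is easy to get wrong, so a numerical cross-check is useful. The sketch below assumes that the integral in question is $\int \sin^2 x \cos^6 x\,dx$ with the antiderivative $\frac{5}{128}x - \frac{1}{128}\sin 4x + \frac{1}{48}\sin^3 2x - \frac{1}{1024}\sin 8x$. The function names and the use of the composite Simpson rule are our own choices, not something prescribed by the text.

```python
import math


def integrand(x):
    """sin^2 x * cos^6 x."""
    return math.sin(x) ** 2 * math.cos(x) ** 6


def antiderivative(x):
    """Closed form obtained by the double-angle reductions."""
    return (5.0 * x / 128.0 - math.sin(4.0 * x) / 128.0
            + math.sin(2.0 * x) ** 3 / 48.0 - math.sin(8.0 * x) / 1024.0)


def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


if __name__ == "__main__":
    numeric = simpson(integrand, 0.0, 1.0)
    closed_form = antiderivative(1.0) - antiderivative(0.0)
    print(numeric, closed_form)  # the two values should nearly coincide
```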

We sometimes perform integration by parts in computing integrals
(35) in order to reduce the positive exponents and increase
the negative exponents in powers of $\sin x$ and $\cos x$. For example,
we can put $\cos x = u$ and $dv = \frac{\cos x}{\sin^3 x}\,dx$ (that is
$du = -\sin x\,dx$ and $v = -\frac{1}{2\sin^2 x}$) when integrating the function
$\frac{\cos^2 x}{\sin^3 x}$:

$$\int \frac{\cos^2 x}{\sin^3 x}\,dx = -\frac{\cos x}{2\sin^2 x} - \frac{1}{2}\int \frac{dx}{\sin x}$$

Now, making the change of variable $\tan\frac{x}{2} = t$ we obtain

$$\int \frac{\cos^2 x}{\sin^3 x}\,dx = -\frac{\cos x}{2\sin^2 x} - \frac{1}{2}\int \frac{2\,dt\,(1 + t^2)^{-1}}{2t\,(1 + t^2)^{-1}} = -\frac{\cos x}{2\sin^2 x} - \frac{1}{2}\ln|t| + C = -\frac{\cos x}{2\sin^2 x} - \frac{1}{2}\ln\left|\tan\frac{x}{2}\right| + C$$

[here we have utilized formula (34) in transforming the second
integral].

11. General Remarks. Since integration is a much more compli- 
cated procedure compared to differentiation the reader must care- 
fully study the basic methods of integration. But, on the other hand, 
it is inexpedient to carry out complicated calculations every time 
when it is necessary to compute an integral. It is therefore advisable
to use reference books in which the most widely encountered integrals
are collected in an orderly way. In particular, we refer the reader
to [7], [19] and [46].

Many important integrals are not elementary functions, that is 
they cannot be expressed in terms of finite combinations of the 
simplest elementary functions which are studied in elementary 
mathematical courses. For instance, the integral 


1 

f a 2+1 dr= f (2+1)? dx 
belongs to the type considered in Sec. 9. But since here we have 
n= pa + and m = 0 the integral cannot be reduced to those 


three cases we studied in Sec. 9. Similarly, the integrals 
$$\int \sin x \cdot x^{\alpha}\,dx, \quad \int e^{x} x^{\alpha}\,dx, \quad \int \cos x \cdot x^{\alpha}\,dx \quad (\alpha \neq 0, 1, 2, \ldots) \tag{36}$$

are not expressible in terms of elementary functions, and therefore
all the integrals that can be reduced to these integrals cannot be
expressed in terms of elementary functions either. Examples of
such integrals are

$$\int e^{x^2}\,dx = \frac{1}{2}\int e^{u} u^{-1/2}\,du$$

(here we have applied the substitution $x^2 = u$, $dx = \frac{du}{2\sqrt{u}}$),

$$\int \sin x^2\,dx = \frac{1}{2}\int \sin u \cdot u^{-1/2}\,du \quad (x^2 = u),$$

$$\int \cos x^2\,dx = \frac{1}{2}\int \cos u \cdot u^{-1/2}\,du \quad (x^2 = u)$$

and

$$\int \sqrt{\sin x}\,dx = \int u^{1/2}(1 - u^2)^{-1/2}\,du$$

(the last transformation is carried out by means of the change of
variable $\sin x = u$, $dx = \frac{du}{\sqrt{1 - u^2}}$; this results in the integral on
the right-hand side which belongs to the type considered in Sec. 9
for $n = 2$, $p = -\frac{1}{2}$ and $m = \frac{1}{2}$).


There are wide classes of such non-elementary integrals. For
instance, as a rule, integrals of the form

$$\int R\left(x, \sqrt{P_n(x)}\right)dx \tag{37}$$

where $R$, as before, is the sign of a rational function and $P_n(x)$
is a polynomial of degree $n \geq 3$, are not expressible in terms of
elementary functions.

In the past the fact that certain integrals cannot be reduced to 
elementary functions was thought of as a catastrophe. But now we 
can easily overcome such difficulties. First of all, there are extensive 
tables of many important non-elementary functions in terms of which 
very many integrals that cannot be reduced to elementary functions 
are expressed. In Sec. XIV.12 we shall give examples of such non- 
elementary functions (special functions) to which all the integrals 
of form (36) can be reduced. Integral (37) for n = 3 and n = 4 is 
called an elliptic integral. Such an integral can be expressed in
terms of the so-called elliptic functions which are thoroughly investigated.
These methods and formulas can be found in the reference
books we mentioned above. We also refer the reader to [23].

Besides, at present the techniques of computing integrals have 
become so perfect that the investigation of a function represented 
in the form of an integral is not more difficult than the investigation 
of a function represented directly, without the integral signs. There- 
fore now even when we encounter an integral which is expressible 
in terms of elementary functions but which involves very compli- 
cated combinations of them we usually prefer to deal with the inte- 
gral representation of the function. 
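To illustrate the last remark, here is a minimal Python sketch that treats the non-elementary function $F(x) = \int_0^x e^{t^2}\,dt$ purely through its integral representation. The composite Simpson rule and the comparison with the term-by-term integrated power series of $e^{t^2}$ are our own choices for the illustration and are not methods taken from this section. The names `simpson`, `F` and `exp_x2_series` are hypothetical.

```python
import math


def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def F(x):
    """F(x) = integral of exp(t^2) from 0 to x, computed numerically."""
    return simpson(lambda t: math.exp(t * t), 0.0, x)


def exp_x2_series(x, terms=30):
    """Same integral from the series sum of x^(2k+1) / (k! (2k+1))."""
    return sum(x ** (2 * k + 1) / (math.factorial(k) * (2 * k + 1))
               for k in range(terms))


if __name__ == "__main__":
    for x in (0.25, 0.5, 1.0):
        print(x, F(x), exp_x2_series(x))
```

Once such a routine is available, the function can be tabulated, plotted or differentiated numerically just like an elementary one.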


CHAPTER XIV


Definite Integral 


In solving many important problems we have to sum up an infinite 
number of infinitesimal summands. This leads to one of the basic 
concepts of mathematics, namely to that of the definite integral. 
Essentially, it is this concept to which all the methods of integration 
presented in Chapter XIII are applied. 


§ 1. Definition and Basic Properties 


1. Examples Leading to the Concept of Definite Integral. Let us
consider a problem which is the reverse of the one considered at the
end of Sec. IV.4 that led us to the concept of a derivative. Namely,
let us regard the law of variation of the instantaneous velocity of
a material point $v = v(t)$ as known and calculate the path
length covered during a period of time from $t = \alpha$ to $t = \beta$.

Since we do not suppose that the motion is uniform we cannot 
compute the path as the product of the velocity by the time taken. 
We shall therefore apply the following procedure. Let us divide 
the whole time interval into a large number of small subintervals 
of time which may not be equal to each other: 


$$t_0 = \alpha < t_1 < t_2 < \ldots < t_{n-1} < t_n = \beta$$

where $t_1, \ldots, t_{n-1}$ are some intermediate instants of time which
are chosen arbitrarily. If these subintervals are sufficiently small
we can regard the motion as being uniform during each of the subintervals
without making a considerable error. Hence we can put
down the following approximate expression of the path:

$$s \approx v_1\,\Delta t_1 + v_2\,\Delta t_2 + \ldots + v_n\,\Delta t_n \tag{1}$$

Here $v_k$ ($k = 1, 2, \ldots, n$) is one of the values of the instantaneous
velocity $v$ attained on the $k$th subinterval of time, i.e. $v_k = v(\tau_k)$
where $t_{k-1} \leq \tau_k \leq t_k$, and $\Delta t_k = t_k - t_{k-1}$ is the length of the
subinterval. [The reader should pay attention to the difference
between the notation introduced here and the one used in § V.2
where we had $t_k - t_{k-1} = \Delta t_{k-1}$ and $v_k = v(t_k)$.] Hence, formula
(1) can be rewritten in another form:

$$s \approx \sum_{k=1}^{n} v(\tau_k)\,\Delta t_k \quad (\alpha = t_0 < t_1 < \ldots < t_n = \beta, \; t_{k-1} \leq \tau_k \leq t_k)$$
k=1 


The smaller the subintervals of division of the original time inter- 
val, the greater the accuracy of the formula. To obtain the exact 
formula we must pass to the limit assuming that the partitions of 
the original time interval are chosen in such a way that the lengths 
of the subintervals tend to zero: 


$$s = \lim \sum_{k=1}^{n} v(\tau_k)\,\Delta t_k \tag{2}$$


Similarly, if in the second example given in Sec. IV.1 concerning
the problem of filling the vessel we regard the velocity of filling
$w = w(t)$, which can be variable, as known then the total volume
$V$ filled during the time period from $\alpha$ to $\beta$ is equal to

$$V = \lim \sum_{k=1}^{n} w(\tau_k)\,\Delta t_k \tag{3}$$

where the notation is understood in the same sense as before. The 
reason for putting down formula (3) is essentially the same as for 
formula (2): in calculating the volume V we can regard the rate 
of filling as being almost constant during any small time interval 
or, more precisely, the rate can be regarded as being constant during 
any infinitesimal time period. 

Let us turn to the third example considered in Sec. IV.1. If we
regard the linear density of the thread at each point $s$ as given, that
is if the function $\rho = \rho(s)$ is known, then, after the manner of the
previous examples, we can write the following expression for the
total mass of the thread:

$$M = \lim \sum_{k=1}^{n} \rho(\sigma_k)\,\Delta s_k \quad (\alpha = s_0 < s_1 < \ldots < s_n = \beta, \; s_{k-1} \leq \sigma_k \leq s_k) \tag{4}$$

Here $\alpha$ and $\beta$ are the values of $s$ corresponding to the ends of the
thread, and the limit is taken in an imaginary process in which
the subintervals of partitions are decreased infinitely.

Finally, let us consider an important geometric example. Let it
be necessary to compute the area of the figure which is shaded in
Fig. 244. For simplicity's sake, we shall suppose that $f(x) > 0$.
Such a geometric figure is called a curvilinear trapezoid. Let us
divide the whole interval $\alpha \leq x \leq \beta$ of variation of $x$ into small
subintervals by means of the points of division
$x_0 = \alpha < x_1 < x_2 < \ldots < x_{n-1} < x_n = \beta$, and let us approximately regard
the altitude of the geometric figure based on each of the subintervals
as being constant and taking on a certain value $f(\xi_k)$ where
$x_{k-1} \leq \xi_k \leq x_k$. Then we can put down an approximate expression for
the area of the curvilinear trapezoid, namely

$$S \approx \sum_{k=1}^{n} f(\xi_k)\,\Delta x_k$$

The geometric meaning of the right-hand side of the last formula
lies in its being equal to the area of the “step-like” figure depicted
in Fig. 244. The figure is obtained from the curvilinear trapezoid
by replacing each of the $n$ trapezoids (which are the parts the
trapezoid is divided into) by a rectangle having the same base and
an intermediate altitude. Passing to the limit as the subintervals
are infinitely decreased we obtain

$$S = \lim \sum_{k=1}^{n} f(\xi_k)\,\Delta x_k \tag{5}$$

Fig. 244


2. Basic Definition. Expressions (2)-(5) which arise when we are 
solving various problems are of the same structure. Similar expres- 
sions are encountered in solving many other problems. All this 
confirms the expedience of the following general definition. 

Let a function $f(x)$ be defined over $\alpha \leq x \leq \beta$. Let us divide the
interval in an arbitrary way into small subintervals by points of
division $x_0 = \alpha < x_1 < x_2 < \ldots < x_n = \beta$, and write an integral
sum of the form

$$\sum_{k=1}^{n} f(\xi_k)\,\Delta x_k = f(\xi_1)\,\Delta x_1 + f(\xi_2)\,\Delta x_2 + \ldots + f(\xi_n)\,\Delta x_n \tag{6}$$

where each of the points $\xi_k$ is arbitrarily chosen between $x_{k-1}$ and
$x_k$, that is somewhere on the $k$th subinterval, and $\Delta x_k = x_k - x_{k-1}$.
Now let the lengths $\Delta x_k$ be infinitely decreased; then the limit to
which the integral sum tends in this process is called the definite
integral of the function $f(x)$ taken over the interval of integration
$\alpha \leq x \leq \beta$. The definite integral is denoted as

$$\int_{\alpha}^{\beta} f(x)\,dx = \lim \sum_{k=1}^{n} f(\xi_k)\,\Delta x_k \tag{7}$$

Accordingly, in the examples considered in Sec. 1 we obtain,
respectively,

$$s = \int_{\alpha}^{\beta} v\,dt, \quad V = \int_{\alpha}^{\beta} w\,dt, \quad M = \int_{\alpha}^{\beta} \rho(s)\,ds \quad \text{and} \quad S = \int_{\alpha}^{\beta} f(x)\,dx \tag{8}$$
The last equality implies the geometric meaning of the definite
integral in the case when the function $y = f(x)$ (the integrand) is
positive: in this case the integral is equal to the area of the curvilinear
trapezoid bounded by the graph of the function, the axis of
abscissas and the straight lines parallel to the axis of ordinates
passing through the end-points of the interval of integration. The
end-points, that is the numbers $\alpha$ and $\beta$, are called, respectively,
the lower limit and the upper limit of integration. The expression
$f(x)\,dx$ is called the element of integration.

Fig. 245

If the integrand is negative or changes its sign then some terms
entering into integral sum (6) will be negative. Therefore after
passing to the limit we see that the integral is equal to the algebraic
sum of the areas of the parts of the curvilinear trapezoid which lie
over and under the $x$-axis (see Fig. 245). The areas of the parts
lying over the $x$-axis are taken with the sign $+$ and those under the
$x$-axis are taken with the sign $-$.

Comparing formulas (8) we can also conclude that in order to
calculate the path length covered by a point in its rectilinear
motion, for a given relationship between the velocity and the time
taken which is represented by a graph (see Fig. 246), it is sufficient
to compute the area of the corresponding curvilinear trapezoid.
In this example we must also take the area with the sign $-$ if $v < 0$,
that is if the graph lies under the $t$-axis, because the increment of
the coordinate of the moving point is negative in such a case. This
rule of signs for computing an area holds for a great number of
other examples.

Let us dwell in more detail on the passage to the limit in formula
(7). The limit is sometimes said to be taken as $n \to \infty$ but
this is not precise because we do not suppose that the subintervals
$\Delta x_k$ are of the same length and therefore if we limit ourselves to
the condition that only $n \to \infty$ then we can encounter a case when
the subintervals belonging to one part of the interval $\alpha \leq x \leq \beta$
decrease whereas the others do not. It is therefore better to say that
the limit is taken while the lengths of the subintervals are infinitely
decreased. The degree of the decrease can be characterized by the
largest of the lengths $\Delta x_k$ of a given partition because if the largest
length is small then the other lengths are automatically small.
Hence, we can say that the passage to the limit in formula (7) is
performed in a process in which $\max_k \Delta x_k \to 0$.
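The difference between the conditions $n \to \infty$ and $\max_k \Delta x_k \to 0$ can be seen in a small experiment. The Python sketch below is an illustration of our own: the partition `lopsided_partition`, which refines only the left half of the interval $0 \leq x \leq 1$, and all function names are hypothetical. With left end-points as intermediate points and $f(x) = x^2$, the sums over the lopsided partitions approach $\frac{1}{6}$ rather than the value of the integral, although the number of subintervals grows without bound.

```python
def riemann_sum(func, points):
    """Integral sum with the left end-point chosen on each subinterval."""
    return sum(func(points[k]) * (points[k + 1] - points[k])
               for k in range(len(points) - 1))


def mesh(points):
    """Largest subinterval length of the partition."""
    return max(points[k + 1] - points[k] for k in range(len(points) - 1))


def uniform_partition(n):
    """n equal subintervals of [0, 1]."""
    return [k / n for k in range(n + 1)]


def lopsided_partition(n):
    """n subintervals of [0, 1]: n - 1 equal parts of [0, 0.5] plus [0.5, 1]."""
    return [0.5 * k / (n - 1) for k in range(n)] + [1.0]


if __name__ == "__main__":
    square = lambda x: x * x
    for n in (10, 100, 1000):
        u, l = uniform_partition(n), lopsided_partition(n)
        print(n, mesh(u), riemann_sum(square, u), mesh(l), riemann_sum(square, l))
```

For the lopsided partitions the largest length stays equal to 0.5, which is exactly why the sums fail to approach the integral.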
Let us consider an example of calculating a definite integral on
the basis of its definition (7). Let it be necessary to compute the
integral $\int_0^1 x^2\,dx$. Divide the interval of integration into five equal
parts of length 0.2. For definiteness, let us choose a point on each
of the subintervals at its left end-point. Then $\xi_1 = 0.0$, $\xi_2 = 0.2$,
$\xi_3 = 0.4$, $\xi_4 = 0.6$, $\xi_5 = 0.8$ and

$$\int_0^1 x^2\,dx \approx \sum_{k=1}^{5} \xi_k^2\,\Delta x_k = (0.0^2 + 0.2^2 + 0.4^2 + 0.6^2 + 0.8^2) \cdot 0.2 = 0.24$$

An analogous division into 10 subintervals would yield the value
0.29 and the division into 100 equal subintervals would yield the
value 0.33. (In Sec. 3 we shall find the exact value of the integral
which turns out to be equal to $\frac{1}{3}$; thus the above value 0.33 is very
close to the exact value. We suggest that the reader should elucidate
the fact that the approximate values obtained in our example are
less than the exact value, by means of a graphical construction.)
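The three approximate values quoted in this example are easy to reproduce. The following Python sketch forms the integral sum with left end-points on $n$ equal subintervals; the function name `left_riemann_sum` is our own.

```python
def left_riemann_sum(func, a, b, n):
    """Integral sum over n equal subintervals of [a, b], left end-points."""
    dx = (b - a) / n
    return sum(func(a + k * dx) for k in range(n)) * dx


if __name__ == "__main__":
    for n in (5, 10, 100):
        # Values approach 1/3 from below as n grows.
        print(n, left_riemann_sum(lambda x: x * x, 0.0, 1.0, n))
```

For $n = 10$ the sum equals 0.285 and for $n = 100$ it equals 0.32835, which the text rounds to 0.29 and 0.33.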

We see that there is arbitrariness in forming an integral
sum because it depends both on the choice of the points of division
$x_k$ and on the choice of intermediate points $\xi_k$. But nevertheless
if we take a partition whose subintervals are sufficiently small the
sum will be practically equal to its limit, that is to integral (7)
which, of course, depends neither on the points $x_k$ nor on the points
$\xi_k$. Each of the summands entering into an integral sum becomes
very small when we take a partition in which $\Delta x_k$ are sufficiently
small, the smallness of the summands being implied by the smallness
of $\Delta x_k$. But at the same time the number of summands becomes
so large that the whole sum has a finite value. Roughly speaking,
if the number of summands entering into an integral sum is equal
to $n$ then each $\Delta x_k$ (and therefore each of the summands) is of the
order of $\frac{1}{n}$ whereas the whole sum is of the order of $n \cdot \frac{1}{n} = 1$, i.e.
it is finite. Thus, taking into account that we pass to the limit in
formula (7) we can say that the definite integral is a sum of infinitely
many infinitesimal summands. Practically, we can often regard
a definite integral as a sum of a great number of very small homogeneous
summands (that is the summands of the same dimension, of
the same character, of the same sense etc.), the summands being
so small that the sum is practically equal to its limit. Such an
approach completely corresponds to the practical concept of infinitely
large and infinitely small quantities which are understood (see
Secs. III.1 and III.3) as quantities that are, respectively, very large
and very small but finite, theoretically. It should be noted that
not every sum of an infinite number of infinitesimal summands yields
an integral. Indeed, as we have seen, for such a sum to assume a finite
value, the number and the magnitude of the summands should be
coherent in a certain sense.


The interpretation of the integral as a sum accounts for its notation.
In fact, if we regard the summands entering into sum (6) as
infinitesimals and denote the lengths of the subintervals as $\Delta x_k = dx$
then the whole sum (6) can be put down in the form

$$\sum_{(\text{from } x = \alpha)}^{(\text{to } x = \beta)} f(x)\,dx$$

At first sums were denoted by the letter S. Then it was gradually
lengthened and this resulted in modern notation (7).

In conclusion note that an integrand can be either a continuous 
function on the interval of integration or a discontinuous one, that 
is having points of discontinuity. But in § 1 we impose the condition
that the interval of integration is finite and that the
integrand does not approach infinity on the interval. It can be
shown that under these assumptions (and under some additional 
requirements) the definite integral exists, that is it has a finite 
value. A rigorous proof of this assertion which is not based on phy- 
sical or geometric considerations can be found in more comprehen- 
sive courses on mathematical analysis. As we shall show in § 4, 
the integral may not have a certain numerical value if the above 
conditions are violated. 

3. Relationship Between Definite Integral and Indefinite Integral. 
We begin with a simple remark that a definite integral does not 
depend on the notation of the variable of integration, i.e. 


$$\int_{\alpha}^{\beta} f(x)\,dx = \int_{\alpha}^{\beta} f(t)\,dt = \int_{\alpha}^{\beta} f(s)\,ds = \ldots \tag{9}$$

Indeed, for example, this is implied by the fact that all the integrals
put down above are equal to the same area. Thus, the variable
of integration in a definite integral is a dummy variable similar
to an index of summation (see Sec. III.6) and it can be denoted by
any letter or symbol.

Let $f(x)$ be a function that we are going to integrate. But let
the lower limit alone (denoted as $x_0$) be fixed, and the upper limit
(which we denote by $x$) be arbitrary, i.e. variable. Then the value
of the integral itself will depend on $x$, and we can therefore denote
it as $\Phi(x)$. Hence we can write

$$\Phi(x) = \int_{x_0}^{x} f(x)\,dx \quad (x_0 = \text{const})$$

or, taking into account equalities (9),

$$\Phi(x) = \int_{x_0}^{x} f(t)\,dt \quad (x_0 = \text{const}) \tag{10}$$

The first form of writing may sometimes lead to misunderstandings
because the letter $x$ entering into it is simultaneously understood
in two different senses, namely as the variable of integration and
as the upper limit. Therefore, although the first form is admissible,
the second is preferable.

Now let us prove that the function $\Phi(x)$ thus constructed is an
antiderivative (see Sec. XIII.1) of the integrand $f(x)$, that is

$$\frac{d}{dx}\int_{x_0}^{x} f(t)\,dt = \left(\int_{x_0}^{x} f(t)\,dt\right)' = f(x)$$

Thus, we assert that the derivative of a definite integral with
respect to the upper limit is equal to the value of the integrand for
the value of its argument equal to the upper limit. To prove the
assertion we first suppose that $f(x)$ is a continuous function and
consider Fig. 247. The geometric meaning of the integral implies
that if $x$ gains an increment $\Delta x$ then $\Delta\Phi$ is equal to the area
shaded in Fig. 247. This area is approximately equal to the product
$\Delta x \cdot f^*$ where $f^*$ is an intermediate ordinate equal to one
of the values of the function taken between the points $x$ and
$x + \Delta x$. It follows that $\frac{\Delta\Phi}{\Delta x} = f^* = f(x^*)$. Hence, if $\Delta x \to 0$
then $x^* \to x$, and therefore passing to the limit we obtain

$$\Phi'(x) = \lim_{\Delta x \to 0} \frac{\Delta\Phi}{\Delta x} = \lim_{x^* \to x} f(x^*) = f(x)$$

which is what we set out to prove.

Fig. 247

In particular, we see that a continuous function always has an
antiderivative (see Sec. XIII.1). To find one of the antiderivatives
we can evaluate the definite integral of the given function for
a fixed lower limit and regard it as a function of its upper limit.

Fig. 248

Now suppose that the integrand is discontinuous (but finite since
for the time being we consider only finite functions). Then function
(10) is continuous at the points of discontinuity of $f(x)$ but the
derivative of $\Phi(x)$ has jump discontinuities at these points. Hence,
the graph of $\Phi(x)$ is “broken” at such points (see Fig. 248). We
extend the notion of an antiderivative when we admit these “breaks”
because at a point of this kind there is no single value of the derivative.
But this natural extension enables us to say that each function
that is finite everywhere has an antiderivative which is a continuous
function.
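The assertion that the derivative of an integral with respect to its upper limit equals the integrand can be tested numerically. In the Python sketch below the function $\Phi(x)$ of formula (10) is computed by the composite Simpson rule and then differentiated by a symmetric difference quotient. Both numerical methods, and all the names used, are our own assumptions made for the illustration.

```python
import math


def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def integral_with_variable_upper_limit(func, x0):
    """Return Phi, where Phi(x) is the integral of func from x0 to x."""
    def Phi(x):
        return simpson(func, x0, x)
    return Phi


def derivative(func, x, h=1e-4):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


if __name__ == "__main__":
    Phi = integral_with_variable_upper_limit(math.cos, 0.0)
    for x in (0.3, 1.2, 2.5):
        print(x, derivative(Phi, x), math.cos(x))  # the last two columns should agree
```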

Now suppose that we have to evaluate the integral

$$I = \int_{\alpha}^{\beta} f(x)\,dx$$

and that we know one of the antiderivatives of the function $f(x)$
which we designate as $F(x)$. The function $\int_{\alpha}^{x} f(t)\,dt$ also being an
antiderivative of $f(x)$, we have, by Sec. XIII.1, the relation

$$\int_{\alpha}^{x} f(t)\,dt = F(x) + C$$

where $C$ is a constant. Putting here $x = \alpha$ we see that the geometric
meaning of the integral implies that the left-hand side of the relation
vanishes, that is we have

$$0 = F(\alpha) + C, \quad C = -F(\alpha) \quad \text{and} \quad \int_{\alpha}^{x} f(t)\,dt = F(x) - F(\alpha)$$

Putting $x = \beta$ in the last formula we obtain, on the basis of
relation (9), the formula

$$\int_{\alpha}^{\beta} f(x)\,dx = F(\beta) - F(\alpha) \tag{11}$$

Thus, a definite integral is equal to the increment of an antiderivative
of the integrand corresponding to the variation of the independent
variable from the lower limit of integration to the upper
limit. The right-hand side of formula (11) is also designated as
$F(x)\big|_{\alpha}^{\beta}$ where $\big|_{\alpha}^{\beta}$ is the sign of double substitution which means that
the lower limit and the upper limit must be substituted for the
argument into the function and then the first result subtracted from
the second.

Another way of writing formula (11) is

$$\int_{\alpha}^{\beta} f(x)\,dx = \left(\int f(x)\,dx\right)\bigg|_{\alpha}^{\beta} \tag{12}$$

Formula (12) is justified by the fact that

$$\left(\int f(x)\,dx\right)\bigg|_{\alpha}^{\beta} = [F(x) + C]\big|_{\alpha}^{\beta} = [F(\beta) + C] - [F(\alpha) + C] = F(\beta) - F(\alpha) = \int_{\alpha}^{\beta} f(x)\,dx$$

where $F(x)$ is one of the antiderivatives and $C$ is a constant (see
Sec. XIII.1).

Thus, a definite integral is equal to the increment of the corres- 
ponding indefinite integral. This result is one of the most important 


theorems in mathematics. It is called the Newton-Leibniz theorem. 
Let us take an example: 


$$\int_0^1 x^2\,dx = \left(\int x^2\,dx\right)\bigg|_0^1 = \frac{x^3}{3}\bigg|_0^1 = \frac{1}{3}$$


It should be noted that when evaluating the indefinite integral here 
we have not written the arbitrary constant C because, as it was 
shown, the terms +C and —C always cancel out. 

We see that if the limits of integration are given a definite integral 
is a constant number whereas the corresponding indefinite integral 
is a function. 
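The Newton-Leibniz theorem can be compared directly with definition (7). The Python sketch below computes an integral sum with midpoints as intermediate points and sets it against the increment of an antiderivative. The names `midpoint_sum` and `increment` are our own, and the choice of midpoints is just one admissible choice of the points $\xi_k$.

```python
import math


def midpoint_sum(func, a, b, n):
    """Integral sum over n equal subintervals, midpoints as intermediate points."""
    h = (b - a) / n
    return sum(func(a + (k + 0.5) * h) for k in range(n)) * h


def increment(F, a, b):
    """Double substitution F(x) taken between the limits a and b."""
    return F(b) - F(a)


if __name__ == "__main__":
    # x^2 on [0, 1] with antiderivative x^3 / 3
    print(midpoint_sum(lambda x: x * x, 0.0, 1.0, 1000),
          increment(lambda x: x ** 3 / 3.0, 0.0, 1.0))
    # sin x on [0, pi] with antiderivative -cos x
    print(midpoint_sum(math.sin, 0.0, math.pi, 1000),
          increment(lambda x: -math.cos(x), 0.0, math.pi))
```

The integral sum requires a thousand evaluations of the integrand, whereas the double substitution requires only two evaluations of the antiderivative.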


Up to now we assumed that $\alpha < \beta$. Let us extend formula (11)
to the case $\alpha > \beta$. This means that for $\alpha > \beta$ we regard formula
(11) as the definition of the integral written on the left-hand side.
Since $f(x) = F'(x)$ formula (11) can be rewritten as

$$\int_{\alpha}^{\beta} F'(x)\,dx = F(\beta) - F(\alpha) \tag{13}$$

Hence, the definite integral of a derivative is equal to the increment
of the antiderivative.

4. Basic Properties of Definite Integral. 

1. The interchange of the limits of integration yields the multiplication
of the integral by $-1$. Actually, by formula (11), we have

$$\int_{\beta}^{\alpha} f(x)\,dx = F(\alpha) - F(\beta) = -[F(\beta) - F(\alpha)] = -\int_{\alpha}^{\beta} f(x)\,dx$$

This simple property can also be put down in the form
$F(x)\big|_{\alpha}^{\beta} = -F(x)\big|_{\beta}^{\alpha}$ which enables us to substitute the limits in reverse
order if we change the sign of the indefinite integral beforehand.
For instance, 


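In particular, property 1 implies the following rule of differentiating
an integral with respect to its lower limit:

$$\frac{d}{dx}\left(\int_{x}^{x_0} f(t)\,dt\right) = \frac{d}{dx}\left(-\int_{x_0}^{x} f(t)\,dt\right) = -\frac{d}{dx}\left(\int_{x_0}^{x} f(t)\,dt\right) = -f(x)$$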


2. If the limits of integration coincide then the integral is equal
to zero, i.e.

$$\int_{\alpha}^{\alpha} f(x)\,dx = 0$$

Property 2 has been already used (see Sec. 3).

3. The theorem on “partition of the interval of integration”:

$$\int_{\alpha}^{\beta} f(x)\,dx + \int_{\beta}^{\gamma} f(x)\,dx = \int_{\alpha}^{\gamma} f(x)\,dx$$

for any $\alpha$, $\beta$ and $\gamma$. In fact, the left-hand side is equal to

$$[F(\beta) - F(\alpha)] + [F(\gamma) - F(\beta)] = F(\gamma) - F(\alpha)$$

which equals $\int_{\alpha}^{\gamma} f(x)\,dx$.

4. The integral of a sum of functions is equal to the sum of the
integrals of the summands (the same is true for the difference):

$$\int_{\alpha}^{\beta} [f(x) \pm \varphi(x)]\,dx = \int_{\alpha}^{\beta} f(x)\,dx \pm \int_{\alpha}^{\beta} \varphi(x)\,dx$$

To prove the property we apply the analogous property of indefinite
integrals (see Sec. XIII.9) and equate the increment of the right-hand
side to the increment of the left-hand side as $x$ varies from $\alpha$
to $\beta$. The following property is proved in a similar way.

5. A constant factor can be taken outside the sign of the integral:

$$\int_{\alpha}^{\beta} M f(x)\,dx = M \int_{\alpha}^{\beta} f(x)\,dx \quad (M = \text{const})$$

Properties 4 and 5 can be formulated simultaneously as “a definite
integral is linear with respect to the integrand”.

The term “linear” is understood here in the sense of Sec. XI.6.
Namely, the formula

$$\int_{\alpha}^{\beta} f(x)\,dx = I \tag{14}$$

for fixed $\alpha$ and $\beta$ determines a correspondence between the finite
(integrable) functions defined over $\alpha \leq x \leq \beta$ and real numbers $I$,
that is to each function there corresponds a certain number $I$.
In other words, formula (14) defines a mapping of the infinite-dimensional
linear space of such functions into the one-dimensional space
of all real numbers. Properties 4 and 5 are then nothing but the 
condition that the mapping is linear. (For instance, let the reader 


verify that, for « = 1 and ß = 2, the number J = 4 corresponds 


to the function y = z? and the number ʻ corresponds to the func- 
tion y = 5 whereas the number 5- £3. 4 = 10.54 corresponds 


to the function y = 5z? ani A rule, a law, according to which 


to functions there correspond numbers, is called a functional. 
Hence, formula (14) determines a linear functional defined over the 
above functional space. 

6. The formula of integration by parts

$$\int_{\alpha}^{\beta} u v'\,dx = (uv)\big|_{\alpha}^{\beta} - \int_{\alpha}^{\beta} u' v\,dx$$

is also deduced from the corresponding formula for indefinite integrals,
namely from formula (XIII.13).
Take an instance:

$$\int_0^{\pi} x \sin x\,dx = (-x\cos x)\big|_0^{\pi} + \int_0^{\pi} \cos x\,dx = (-x\cos x)\big|_0^{\pi} + (\sin x)\big|_0^{\pi} = \pi$$

(here we have put $u = x$, $dv = \sin x\,dx$, $du = dx$ and $v = -\cos x$).
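The formula of integration by parts for definite integrals can be checked on this very instance. The Python sketch below evaluates both sides for $u = x$ and $v = -\cos x$ on the interval from 0 to $\pi$. The composite Simpson rule is our own choice of numerical method and the name `by_parts` is hypothetical.

```python
import math


def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def by_parts(u, du, v, dv, a, b):
    """Return (left, right) sides of the integration-by-parts formula on [a, b]."""
    left = simpson(lambda x: u(x) * dv(x), a, b)
    right = u(b) * v(b) - u(a) * v(a) - simpson(lambda x: du(x) * v(x), a, b)
    return left, right


if __name__ == "__main__":
    left, right = by_parts(
        u=lambda x: x, du=lambda x: 1.0,
        v=lambda x: -math.cos(x), dv=math.sin,
        a=0.0, b=math.pi,
    )
    print(left, right, math.pi)  # all three numbers should nearly coincide
```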
7. The formula of integration by change of variable in definite
integrals is obtained if we equate the increments of both sides of
formula (XIII.15) corresponding to the variation of $t$ from $\alpha$ to $\beta$.
Doing this and taking into account that the variable $x$ which equals
$\varphi(t)$ varies from $\varphi(\alpha)$ to $\varphi(\beta)$, we obtain

$$\int_{\alpha}^{\beta} f(\varphi(t))\,\varphi'(t)\,dt = \left[\int f(x)\,dx\right]\bigg|_{x=\varphi(\alpha)}^{x=\varphi(\beta)}$$

Now, taking advantage of formula (12), we finally derive

$$\int_{\alpha}^{\beta} f(\varphi(t))\,\varphi'(t)\,dt = \int_{\varphi(\alpha)}^{\varphi(\beta)} f(x)\,dx$$

Hence, in applying the formula we should additionally change the
limits of integration. To do this we must find an interval which
should be run by the new variable so that the old variable of integration
should vary over the interval that was originally set for it.

For example, if we want to make the substitution $x = R\sin t$
in order to evaluate the integral

$$\int_0^R \sqrt{R^2 - x^2}\,dx$$

we must take into account that for $x$ to vary from 0 to $R$, it is sufficient
that $t$ should run from 0 to $\frac{\pi}{2}$. Therefore

$$\int_0^R \sqrt{R^2 - x^2}\,dx = R^2\int_0^{\pi/2} \cos^2 t\,dt = \frac{R^2}{2}\int_0^{\pi/2} (1 + \cos 2t)\,dt = \frac{R^2}{2}\left(t + \frac{\sin 2t}{2}\right)\bigg|_0^{\pi/2} = \frac{\pi R^2}{4}$$

As we see, in contrast to the change of variable in an indefinite
integral, the inverse substitution, that is the transition from the
new variable to the old one in the final answer, is not needed in
computing a definite integral. We suggest that the reader should
construct a geometric figure whose area is expressed by the above
integral and, in addition, obtain the same result by means of the
substitution $x = R\cos t$.
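The role of the new limits can be illustrated numerically. The Python sketch below assumes that the integral of this example is $\int_0^R \sqrt{R^2 - x^2}\,dx$ and computes it twice: in the old variable $x$ over the interval from 0 to $R$, and in the new variable $t$ over the interval from 0 to $\frac{\pi}{2}$ after the substitution $x = R\sin t$. The function names and the midpoint sums are our own choices.

```python
import math


def midpoint_sum(func, a, b, n=20000):
    """Integral sum over n equal subintervals, midpoints as intermediate points."""
    h = (b - a) / n
    return sum(func(a + (k + 0.5) * h) for k in range(n)) * h


def quarter_disc_area_in_x(R):
    """Integral of sqrt(R^2 - x^2) for x from 0 to R."""
    return midpoint_sum(lambda x: math.sqrt(R * R - x * x), 0.0, R)


def quarter_disc_area_in_t(R):
    """Same integral after x = R sin t: R^2 cos^2 t for t from 0 to pi/2."""
    return midpoint_sum(lambda t: R * R * math.cos(t) ** 2, 0.0, math.pi / 2.0)


if __name__ == "__main__":
    R = 2.0
    print(quarter_disc_area_in_x(R), quarter_disc_area_in_t(R), math.pi * R * R / 4.0)
```

No return to the variable $x$ is needed in the second computation because the limits have been changed together with the integrand.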

We have deduced properties 3-5 of the definite integral on the basis
of formula (11). But the same could have been achieved with the
help of definition (7) which introduces the definite integral as the
limit of the corresponding integral sum. For instance, passing to
the limit in the formula

$$\sum_{k} [f(\xi_k) \pm \varphi(\xi_k)]\,\Delta x_k = \sum_{k} f(\xi_k)\,\Delta x_k \pm \sum_{k} \varphi(\xi_k)\,\Delta x_k$$

as the subintervals of the partitions of the interval $\alpha \leq x \leq \beta$
tend to zero, we obtain property 4 etc. Property 8 is implied by
the same definition.

8. If the variables in question have certain dimensions then

$$\left[\int_{\alpha}^{\beta} f(x)\,dx\right] = [f] \cdot [x]$$

since the operation of summation and the operation of passing to
a limit do not change dimensions.

Fig. 249


9. There are certain cases when the integration in symmetric
limits can be simplified. Namely, as we see in Fig. 249, we have

$$\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx$$

for every even function $f(x)$, and we have

$$\int_{-a}^{a} f(x)\,dx = 0$$

for every odd function $f(x)$.

10. An integral of a periodic function taken over an interval
whose length is equal to the period of the function does not depend
on the position of the interval on the axis of the variable of integration.
In other words, if $f(x + A) \equiv f(x)$ then the integral

$$\int_{x}^{x+A} f(s)\,ds$$

is independent of $x$. Indeed, according to the rule of differentiating
a composite function, and on the basis of the formulas for the derivatives
of an integral with respect to its lower and upper limits,
we get

$$\frac{d}{dx}\int_{x}^{x+A} f(s)\,ds = f(x + A)\,\frac{d(x + A)}{dx} - f(x) = f(x + A) - f(x) = 0$$

(Let the reader prove the property by taking advantage of the geometric
meaning of the definite integral.)
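Properties 9 and 10 lend themselves to a quick numerical experiment. The Python sketch below uses the composite Simpson rule; the sample functions and the names `symmetric_integral` and `integral_over_period` are our own and serve only as an illustration.

```python
import math


def simpson(func, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def symmetric_integral(func, a):
    """Integral of func over the symmetric interval from -a to a."""
    return simpson(func, -a, a)


def integral_over_period(func, x, period):
    """Integral of func from x to x + period."""
    return simpson(func, x, x + period)


if __name__ == "__main__":
    even = lambda x: x ** 4 + math.cos(x)
    odd = lambda x: x ** 3 - math.sin(x)
    periodic = lambda s: math.sin(s) ** 2 + math.cos(3.0 * s)  # period 2*pi
    print(symmetric_integral(even, 1.5), 2.0 * simpson(even, 0.0, 1.5))
    print(symmetric_integral(odd, 1.5))
    for x in (-1.0, 0.0, 0.7, 5.0):
        print(x, integral_over_period(periodic, x, 2.0 * math.pi))
```

The last loop should print practically the same number for every starting point $x$.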

In conclusion, let us consider several examples of incorrect evaluation
of definite integrals.

1.

$$\int_{-1}^{1} \sqrt{1 - x^2}\,dx = \int_{\pi}^{2\pi} \sqrt{1 - \cos^2 t}\,(-\sin t)\,dt = -\int_{\pi}^{2\pi} \sin^2 t\,dt = -\int_{\pi}^{2\pi} \frac{1 - \cos 2t}{2}\,dt = -\left(\frac{t}{2} - \frac{\sin 2t}{4}\right)\bigg|_{\pi}^{2\pi} = -\frac{\pi}{2}$$

(we have performed the substitution $x = \cos t$, $\pi \leq t \leq 2\pi$).
The result is evidently incorrect since an integral of a positive
function taken in the positive direction (that is from a smaller
limit to a larger one) must be positive. The mistake lies in the replacement
of $\sqrt{\sin^2 t}$ by $\sin t$ whereas it should have been replaced
by $|\sin t|$ (see the end of Sec. I.5). Actually, we have $\sin t < 0$
for $\pi < t < 2\pi$ and hence $\sqrt{\sin^2 t} = -\sin t$ for such $t$. If we
took $\sqrt{\sin^2 t} = |\sin t|$ we should obtain the correct result

$$\int_{-1}^{1} \sqrt{1 - x^2}\,dx = \frac{\pi}{2}$$
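The effect of this mistake is easy to reproduce on a computer. The Python sketch below assumes that the integral under discussion is $\int_{-1}^{1} \sqrt{1 - x^2}\,dx$ with the substitution $x = \cos t$, $\pi \leq t \leq 2\pi$. It evaluates the transformed integral twice: once with $\sqrt{\sin^2 t}$ carelessly replaced by $\sin t$ and once with the square root kept as it stands. The function names and the midpoint sums are our own.

```python
import math


def midpoint_sum(func, a, b, n=20000):
    """Integral sum over n equal subintervals, midpoints as intermediate points."""
    h = (b - a) / n
    return sum(func(a + (k + 0.5) * h) for k in range(n)) * h


def half_disc_area_wrong():
    """sqrt(sin^2 t) carelessly replaced by sin t."""
    return midpoint_sum(lambda t: math.sin(t) * (-math.sin(t)),
                        math.pi, 2.0 * math.pi)


def half_disc_area_right():
    """sqrt(sin^2 t) evaluated as it stands, i.e. as |sin t|."""
    return midpoint_sum(lambda t: math.sqrt(math.sin(t) ** 2) * (-math.sin(t)),
                        math.pi, 2.0 * math.pi)


if __name__ == "__main__":
    print(half_disc_area_wrong(), half_disc_area_right(), math.pi / 2.0)
```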

We sometimes encounter integrals of the form

$$\int_a^b |f(x)|\,dx$$

in analogous situations. These integrals can be treated in the following
way. We begin with determining the intervals of retention of
the sign of the function $f(x)$ (see Sec. III.15). For instance, let
$f(x) > 0$ for $a < x < c$, $f(x) < 0$ for $c < x < d$ and $f(x) > 0$
for $d < x < b$. Then

$$\int_a^b |f(x)|\,dx = \int_a^c |f(x)|\,dx + \int_c^d |f(x)|\,dx + \int_d^b |f(x)|\,dx = \int_a^c f(x)\,dx - \int_c^d f(x)\,dx + \int_d^b f(x)\,dx$$

etc.
2. The correct value of the integral $I = \int_{-1}^{2} x^2\,dx$ is 3 which is obtained without any substitution:
$$I = \frac{x^3}{3}\Big|_{-1}^{2} = \frac{8}{3} + \frac{1}{3} = 3$$
But the following calculations are incorrect, and their result contradicts the correct value I = 3:
$$\int_{-1}^{2} x^2\,dx = \int_{1}^{4} t\,\frac{dt}{2\sqrt{t}} = \frac{1}{2}\int_{1}^{4}\sqrt{t}\,dt = \frac{1}{3}\,t^{3/2}\Big|_{1}^{4} = \frac{7}{3}$$
(we have made the change x² = t, i.e. $x = \sqrt{t}$). The mistake lies in the fact that the formula $x = \sqrt{t}$ of transition from x to t makes no sense for x < 0. Therefore, if there are reasons that make it necessary to perform the substitution x² = t, we must break the integral into two summands according to the formula $\int_{-1}^{2} x^2\,dx = \int_{-1}^{0} x^2\,dx + \int_{0}^{2} x^2\,dx$ and then put $x = -\sqrt{t}$ (1 ≥ t ≥ 0) in the first integral and $x = \sqrt{t}$ (0 ≤ t ≤ 4) in the second integral (let the reader do it!).


3. As in example 1, the following result is apparently incorrect: 


$$\int_{-1}^{2}\frac{dx}{x^2} = -\frac{1}{x}\Big|_{-1}^{2} = -\frac{1}{2} - 1 = -\frac{3}{2}$$


Indeed, the integrand approaches infinity at x = 0, and we cannot therefore apply the Newton-Leibniz formula here. We shall discuss integrals of this type in Sec. 16.

5. Integrating Inequalities. The definition of the integral and its geometric meaning (see Sec. 2) imply that
$$\text{if } f(x) \ge 0 \text{ and } \alpha < \beta \text{ then } \int_{\alpha}^{\beta} f(x)\,dx \ge 0 \tag{15}$$
The last inequality turns into an equality if and only if f(x) ≡ 0 on the interval α ≤ x ≤ β in the case of a continuous function f(x). But if we consider discontinuous integrable functions as well then the integral of a function which is different from zero at a finite number of discrete points is nevertheless equal to zero because such points do not affect the value of the integral.
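If there is a condition
$$\varphi(x) \le \psi(x) \quad \text{for } \alpha \le x \le \beta \tag{16}$$
then putting ψ(x) − φ(x) = f(x) and taking advantage of assertion (15) we deduce
$$\int_{\alpha}^{\beta}[\psi(x)-\varphi(x)]\,dx \ge 0 \quad\text{and}\quad \int_{\alpha}^{\beta}\psi(x)\,dx - \int_{\alpha}^{\beta}\varphi(x)\,dx \ge 0$$
Hence, we have
$$\int_{\alpha}^{\beta}\varphi(x)\,dx \le \int_{\alpha}^{\beta}\psi(x)\,dx \tag{17}$$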

Thus, inequality (16) implies inequality (17) which means that the sign of inequality is retained when we integrate an inequality in the positive direction. (Think about the changes that must be made in the assertion if an inequality is integrated in the negative direction.)

As above, if inequality (16) holds then inequality (17) turns into an equality if and only if φ(x) ≡ ψ(x) for α ≤ x ≤ β in the case of continuous functions φ(x) and ψ(x), although in the case of discontinuous integrable functions we can have the equality even if φ(x) is different from ψ(x) at separate points.

As a consequence of inequality (17) we obtain a crude estimate of the definite integral: let
$$f_{\min} \le f(x) \le f_{\max} \quad (\alpha \le x \le \beta)$$
where $f_{\min}$ and $f_{\max}$ are two constants. Then integrating these inequalities we obtain
$$f_{\min}\,(\beta-\alpha) \le \int_{\alpha}^{\beta} f(x)\,dx \le f_{\max}\,(\beta-\alpha) \tag{18}$$
In connection with this estimate let us consider the important notion of the mean value of a function which is also called "the arithmetic mean" of the function. If a function f(x) is regarded as being defined over an interval α ≤ x ≤ β then its mean value on the interval is a constant $\bar f$ such that the integral of the constant over the interval α ≤ x ≤ β is equal to the integral of the function taken from α to β. Thus, $\int_{\alpha}^{\beta}\bar f\,dx = \int_{\alpha}^{\beta} f(x)\,dx$, i.e.
$$\int_{\alpha}^{\beta} f(x)\,dx = \bar f\cdot(\beta-\alpha)$$
The last formula (the first mean value theorem) implies the following expression for the mean value:
$$\bar f = \frac{1}{\beta-\alpha}\int_{\alpha}^{\beta} f(x)\,dx \tag{19}$$
As would be expected, inequality (18) implies that
$$f_{\min} \le \bar f \le f_{\max}$$
The geometric meaning of the mean of a function is illustrated in Fig. 250. We see that $\bar f$ satisfies the condition that the area of the rectangle AB'C'D is equal to the area of the curvilinear trapezoid ABCD. It is clear that if the function f(x) is continuous it takes on the value $\bar f$ at a point belonging to the interval α ≤ x ≤ β (at the point γ in Fig. 250)*. A discontinuous function may not assume its mean value.

* Hence, for a continuous function f(x), the first mean value theorem is written in the form $\int_{\alpha}^{\beta} f(x)\,dx = f(\gamma)\,(\beta-\alpha)$ where γ is a certain point belonging to the interval α ≤ x ≤ β. - Tr.

Fig. 250

The advisability of the above definition of the mean of a function is well seen if we consider an example of a functional relationship between the instantaneous velocity of a non-uniform motion of a point and a current moment of time. The integral of the velocity over a time interval being equal to the distance passed [see the first formula (8)], we see that the mean value of the velocity on a time interval is a constant velocity such that if the point moved uniformly with this velocity it would cover the same distance as in its non-uniform motion in question, during the same time interval. In other words, formula (19) implies that the mean value of the velocity on a finite time interval is equal to the ratio of the distance covered to the time taken. Hence, this notion agrees with the well-known notion of a mean velocity. The notions of a mean density, mean power etc. are also in agreement with the general notion of the mean value of a function.

If a function is defined over an infinite interval, for instance, in the interval α ≤ x < ∞, then its mean value is defined as
$$\bar f = \lim_{\beta\to\infty}\frac{1}{\beta-\gamma}\int_{\gamma}^{\beta} f(x)\,dx \quad (\gamma = \text{const},\ \alpha \le \gamma < \infty)$$
that is as the limit of the mean value corresponding to a finite interval in the process when the length of the interval is increased unlimitedly. It is easy to verify that if the limit exists it does not depend on the choice of the value γ (which can also be taken as equal to α).

Let us consider an alternating current circuit in which the current j and the voltage u are expressed by the formulas $j = j_0\cos(\omega t+\alpha)$ and $u = u_0\cos(\omega t+\alpha+\varphi)$ where φ is a constant phase shift between the voltage and the current. The mean power of the current in this circuit is equal to
$$\bar p = \overline{ju} = \lim_{T\to\infty}\frac{1}{T}\int_{0}^{T} j_0\cos(\omega t+\alpha)\,u_0\cos(\omega t+\alpha+\varphi)\,dt = \lim_{T\to\infty}\frac{j_0u_0}{2T}\int_{0}^{T}[\cos(2\omega t+2\alpha+\varphi)+\cos\varphi]\,dt =$$
$$= \lim_{T\to\infty}\frac{j_0u_0}{2}\left\{\frac{\sin(2\omega T+2\alpha+\varphi)-\sin(2\alpha+\varphi)}{2\omega T}+\cos\varphi\right\} = \frac{j_0u_0}{2}\cos\varphi$$
This formula accounts for the significance of the quantity cos φ in electrical engineering.
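The limit above can be illustrated numerically by averaging the product ju over a long but finite time T. This is only a sketch under assumed values: the function name `mean_power`, the midpoint rule and all numerical parameters are hypothetical choices made for the illustration.

```python
import math

def mean_power(j0, u0, omega, alpha, phi, T, n=200_000):
    """Mean value of j(t)*u(t) over [0, T], computed by the midpoint rule."""
    h = T / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        j = j0 * math.cos(omega * t + alpha)
        u = u0 * math.cos(omega * t + alpha + phi)
        total += j * u
    return total * h / T

# Assumed sample values (not from the text).
j0, u0, omega, alpha, phi = 2.0, 3.0, 5.0, 0.4, 1.1
approx = mean_power(j0, u0, omega, alpha, phi, T=2000.0)
exact = 0.5 * j0 * u0 * math.cos(phi)   # (j0*u0/2) * cos(phi)
print(approx, exact)
```

For finite T the two numbers differ by the oscillating term, whose size is at most of order $j_0u_0/(2\omega T)$, so the agreement improves as T grows.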

In conclusion, we give one more inequality which is sometimes applied. Since the absolute value of a sum cannot exceed the sum of the absolute values (see the end of Sec. I.5) we can write the inequality
$$\Big|\sum_k f(\xi_k)\,\Delta x_k\Big| \le \sum_k |f(\xi_k)\,\Delta x_k| = \sum_k |f(\xi_k)|\,\Delta x_k$$
for integral sum (6). Then passing to the limit we get
$$\Big|\int_{a}^{b} f(x)\,dx\Big| \le \int_{a}^{b} |f(x)|\,dx \tag{20}$$
In other words, the absolute value of an integral does not exceed the integral of the absolute value of the integrand [think in what cases inequality (20) turns into an equality].


§ 2. Applications of Definite Integral 


6. Two Schemes of Application. There are two basic schemes of 
application of the definite integral to calculating geometrical, phy- 
sical and other quantities. 

The first scheme is based on the definition of the integral which 
introduces it as the limit of an integral sum [see formula (7)]. 
According to this scheme, a quantity in question is approximately 
represented as an integral sum so that the representation should 
become more and still more precise, as the lengths of the subintervals 
of partitions are decreased, and quite exact after the passage to the 
limit. Therefore the quantity turns out to be equal to the limit of 
the integral sums, i.e. to the integral. This method was clearly illu- 
strated by the examples considered in Sec. 1 which led to four inte- 
grals (8). As it was indicated in Sec. 2, this scheme is based on the 
representation of a definite integral in the form of a sum of infinitely 
many infinitesimal summands. 

The second scheme of application of integrals consists in forming 
a relationship between the differentials of quantities in question, 
that is in forming a so-called differential equation. After the rela- 
tionship between the differentials has been deduced we apply for- 
mula (13), which can also be put down in the form 


$$\int dy = y_{\text{terminal}} - y_{\text{initial}}$$
and thus obtain a relationship between the quantities themselves. The meaning of the above formula is that the sum of infinitesimal increments of a quantity is equal to the total increment of the quantity.

Let us consider an example. Suppose a point is moving along the s-axis under the action of a variable force which is directed along the axis and assumes the value F(s) at each point s. Let the point pass the distance from s = a to s = b and let it be necessary to compute the work $A_{\text{total}}$ of the force along this path. The work A of the force performed in the process of motion is connected by a functional relationship with the distance passed, that is A = A(s). When the point passes a small interval from s to s + Δs the force does not change considerably and we can therefore approximately regard it as constant along this small path and, according to the well-known physical formula, we can write
$$\Delta A \approx F(s)\,\Delta s$$
A more precise formula has the form
$$\Delta A = F(s)\,\Delta s + \alpha \tag{21}$$
where |α| ≪ Δs, i.e. α is of higher order of smallness relative to Δs. The fact that α is indeed an infinitesimal of higher order of smallness is implied by the following consideration: α is caused by the variability of F on the interval Δs but the variation of F is infinitesimal when Δs is infinitesimal and, besides, this variation is multiplied by Δs when ΔA is calculated.

Now if we recall that a differential is defined as the principal (linear) part of the corresponding increment (see Sec. IV.8) we can write, on the basis of (21), that
$$dA = F(s)\,ds \tag{22}$$
Integrating we obtain
$$A_{\text{total}} = A(b) - A(a) = \int_{a}^{b} dA = \int_{a}^{b} F(s)\,ds$$
This formula is often written in a simplified form as
$$A = \int F\,ds$$
Although the limits of integration are not put down in the last formula the integral is understood as a definite integral having certain limits of integration.

In problem-solving practice the above detailed consideration is 
usually replaced by the following simplified consideration: the 
force can be regarded as constant along the infinitesimal path ds 
and this immediately implies formula (22) for the corresponding 
infinitesimal increment of the work. Then formula (22) is integrated 
etc. This consideration is brief but quite correct, and if we discuss 
it at length we shall arrive at the comprehensive consideration which 
was given previously. We shall turn back to this question in Sec. XVI.4.

7. Differential Equation with Variables Separable. The general 
form of a relationship of type (22) can be written as 


dy = f (x) dx (23) 
where x and y are some variables connected by a functional relationship. Integrating we deduce
$$y_1 - y_0 = \int_{x_0}^{x_1} f(x)\,dx$$
where $y_0 = y(x_0)$ and $y_1 = y(x_1)$.

Equation (23) is the simplest differential equation. Differential equations will be treated in more detail in Chapter XV but some simple examples that can be considered without applying the general theory can be illustrated here. For instance, we often encounter differential equations of the form
$$dy = \varphi(y)\,dx \tag{24}$$
We cannot simply integrate both sides of the equation because in this case the integrand on the right-hand side would contain an unknown function y(x). We must therefore transpose φ(y) to the left-hand side, that is we must write $\frac{dy}{\varphi(y)} = dx$ beforehand. Then the integration yields
$$\int_{y_0}^{y_1}\frac{dy}{\varphi(y)} = x_1 - x_0 \quad (y_0 = y(x_0),\ y_1 = y(x_1))$$
Similarly, an equation of the form
$$dy = f(x)\,\varphi(y)\,dx \tag{25}$$
is integrated as follows:
$$\frac{dy}{\varphi(y)} = f(x)\,dx,\qquad \int_{y_0}^{y_1}\frac{dy}{\varphi(y)} = \int_{x_0}^{x_1} f(x)\,dx$$
Equations (23)-(25) are called differential equations with variables separable because the terms containing x and dx can be separated from the terms containing y and dy by means of simple algebraic transformations, and after this the integration is carried out immediately.


As an example, let us consider the problem of outflow of a liquid from a cylindric vessel through an opening of area σ at the bottom of the vessel (see Fig. 251). Here the height h of the level of the liquid above the bottom depends on the time t, i.e. h = h(t). If the liquid is not viscous, and if it is permissible to neglect the forces of surface tension, the exit velocity v with which the liquid flows out of the vessel is described, with sufficient accuracy, by Torricelli's law (established by E. Torricelli, 1608-1647, a prominent Italian physicist and mathematician):
$$v = \sqrt{2gh} \tag{26}$$
We can readily form the differential equation of the problem on the basis of this law. Let us use brief considerations similar to those in the last paragraph of Sec. 6. The exit velocity can be regarded as constant during the time interval dt, and therefore, by formula (26), the corresponding outflow is the volume $dV = \sigma v\,dt = \sigma\sqrt{2gh}\,dt$ of the liquid.

On the other hand, the same volume is equal to dV = S |dh| = −S dh, where S is the area of the horizontal cross-section of the vessel. (One should take into account that h decreases here and therefore dh < 0.) Equating both expressions of the volume we obtain the equation
$$-S\,dh = \sigma\sqrt{2gh}\,dt \tag{27}$$
belonging to type (24). In order to integrate the equation let us separate the terms depending on h (and on dh) from the terms depending on t (and dt):
$$-\frac{S\,dh}{\sigma\sqrt{2gh}} = dt$$
Integrating we obtain
$$-\int_{H}^{0}\frac{S\,dh}{\sigma\sqrt{2gh}} = \int_{0}^{T} dt = T$$
where H is the initial height of the level and T is the total time of outflow of the liquid. We finally obtain
$$-\frac{S}{\sigma\sqrt{2g}}\,2\sqrt{h}\,\Big|_{h=H}^{h=0} = T,\quad\text{i.e.}\quad T = \frac{S}{\sigma}\sqrt{\frac{2H}{g}}$$

8. Computing Areas of Plane Geometric Figures. The application of the definite integral to computing the area of a curvilinear trapezoid was considered in Sec. 2, and the corresponding rule of signs was illustrated in Fig. 245.

Fig. 251    Fig. 252

If it is necessary to compute the whole area shaded in Fig. 245 in the "arithmetic" sense (i.e. not in the "algebraic" sense) we can utilize the formula
$$S_1 + S_2 + S_3 = \int_{a}^{b} |f(x)|\,dx$$
where the last integral can be evaluated according to the scheme described in example 1 of Sec. 4.

The computation of the areas of figures other than curvilinear trapezoids can also be performed with the help of integrals. For instance, the area of the geometric figure shown in Fig. 252 can be obtained as the difference of the areas of the two curvilinear trapezoids ACMDBA and ACNDBA, i.e.
$$S = \int_{a}^{b} f_2(x)\,dx - \int_{a}^{b} f_1(x)\,dx = \int_{a}^{b}[f_2(x)-f_1(x)]\,dx = \int_{a}^{b} h(x)\,dx \tag{28}$$
where h(x) is the length of the segment which is formed when the straight line parallel to the y-axis and passing through the point x of the x-axis crosses the figure.

Formula (28) can also be interpreted as follows. The area of the portion of our figure lying to the left of the straight line x = const depends on x. If we give x an infinitesimal increment dx (see Fig. 252) then the area of the strip shown in Fig. 252 is added to the former area. This additional area can be regarded, to within infinitesimals of higher order of smallness, as a rectangle [compare this with the deduction of formulas (21) and (22)]. It follows that dS = h(x) dx. Now, integrating, we obtain formula (28) again.

The contour of a figure, that is the curve which bounds its area, is often represented in parametric form. In such cases it is advisable to perform a change of variables in the integrals in question and choose the parameter as a new variable.

For example, let us compute the area bounded by the x-axis and by an arc of a cycloid (see Sec. II.6) with parametric equations (II.12). We mean here a part of the cycloid which connects two neighbouring points of intersection of the cycloid with the x-axis (i.e. an arc lying between two cusps), and the parameter should therefore be taken within the limits 0 ≤ ψ ≤ 2π. Hence,
$$S = \int_{0}^{2\pi R} y\,dx = \int_{0}^{2\pi} R(1-\cos\psi)\,d[R(\psi-\sin\psi)] = R^2\int_{0}^{2\pi}(1-\cos\psi)^2\,d\psi = 3\pi R^2$$
because
$$\int(1-\cos\psi)^2\,d\psi = \int(1-2\cos\psi+\cos^2\psi)\,d\psi = \psi - 2\sin\psi + \int\frac{1+\cos 2\psi}{2}\,d\psi = \frac{3}{2}\psi - 2\sin\psi + \frac{\sin 2\psi}{4} + C$$
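The value 3πR² can be confirmed by integrating y dx numerically in the parameter. This is a hedged sketch: it assumes the cycloid equations x = R(ψ − sin ψ), y = R(1 − cos ψ) used in the computation above, and the helper `simpson` and the sample radius are hypothetical.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

R = 2.0  # assumed sample radius of the rolling circle

def y(psi):
    return R * (1.0 - math.cos(psi))

def dx_dpsi(psi):
    # derivative of x = R*(psi - sin psi) with respect to psi
    return R * (1.0 - math.cos(psi))

# S = integral of y dx, rewritten as an integral over the parameter psi
area = simpson(lambda psi: y(psi) * dx_dpsi(psi), 0.0, 2.0 * math.pi)
print(area, 3.0 * math.pi * R ** 2)
```

The same pattern (multiply y by the derivative of x with respect to the parameter) applies to any curve given parametrically over a single arch.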


Now let us consider the area of a geometric figure bounded by a closed contour (L) represented by its parametric equations x = x(t) and y = y(t). Suppose that the variable point (x, y) [where x = x(t) and y = y(t)] describes the contour in the positive direction once when the parameter t varies from α to γ (the positive direction is understood as counterclockwise; see Fig. 253). Then
$$S = \int_{a}^{b} y_2\,dx - \int_{a}^{b} y_1\,dx$$
But the first integral is equal to $\int_{\gamma}^{\beta} y\dot x\,dt$ because x varies from a to b as t varies from γ to β (see Fig. 253) and we have $y = y_2$ and $\dot x\,dt = dx$ here. The second integral is transformed similarly and thus we obtain
$$S = \int_{\gamma}^{\beta} y\dot x\,dt - \int_{\alpha}^{\beta} y\dot x\,dt = -\int_{\beta}^{\gamma} y\dot x\,dt - \int_{\alpha}^{\beta} y\dot x\,dt = -\int_{\alpha}^{\gamma} y\dot x\,dt \tag{29}$$
Property 10 in Sec. 4 implies that the values t = α and t = γ are not necessarily such that the corresponding points (x, y) should coincide with the extreme left point of the contour; the necessary condition is that the contour should be described exactly once as t varies from α to γ.

Similarly, projecting the contour on the y-axis we can deduce the formula
$$S = \int_{\alpha}^{\gamma} x\dot y\,dt \tag{30}$$
Taking half the sum of formulas (29) and (30) we obtain
$$S = \frac{1}{2}\int_{\alpha}^{\gamma}(x\dot y - y\dot x)\,dt \tag{31}$$
If the contour is described in the negative direction, as t increases, we must change the signs in all formulas (29)-(31).

For example, the area of an ellipse having parametric equations (II.26) can be computed on the basis of formula (31):
$$S = \frac{1}{2}\int_{0}^{2\pi}[a\cos t\cdot b\cos t - b\sin t\,(-a\sin t)]\,dt = \frac{1}{2}\int_{0}^{2\pi} ab\,dt = \pi ab$$
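Formula (31) lends itself to a direct numerical check. The sketch below is illustrative only: the function `contour_area`, the midpoint rule and the semi-axes are hypothetical, and the ellipse is assumed to be parametrized as x = a cos t, y = b sin t, as in the computation above.

```python
import math

def contour_area(x, y, dx, dy, t0, t1, n=2000):
    """S = 1/2 * integral of (x*y' - y*x') dt over [t0, t1], by the midpoint rule.

    The result is positive when the contour is described counterclockwise
    and negative when it is described clockwise.
    """
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        total += x(t) * dy(t) - y(t) * dx(t)
    return 0.5 * total * h

a, b = 3.0, 2.0  # assumed semi-axes

# Counterclockwise traversal: x = a cos t, y = b sin t
ccw = contour_area(lambda t: a * math.cos(t), lambda t: b * math.sin(t),
                   lambda t: -a * math.sin(t), lambda t: b * math.cos(t),
                   0.0, 2.0 * math.pi)

# Clockwise traversal of the same ellipse: x = a cos t, y = -b sin t
cw = contour_area(lambda t: a * math.cos(t), lambda t: -b * math.sin(t),
                  lambda t: -a * math.sin(t), lambda t: -b * math.cos(t),
                  0.0, 2.0 * math.pi)

print(ccw, cw, math.pi * a * b)
```

The second call illustrates the remark on the negative direction: reversing the traversal changes the sign of the result.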


Let us proceed to compute areas in polar coordinates. Let a curve be represented by its polar equation ρ = f(φ) and let it be necessary to compute the area of the curvilinear "sector" α ≤ φ ≤ β (see Fig. 254). If the angle φ is increased by dφ then the area of the portion of the sector lying below the ray ON also gains an increment, the additional area being regarded as an isosceles triangle with altitude ρ and base ρ dφ within an accuracy to infinitesimals of higher order of smallness (why is it so?). Hence,
$$dS = \frac{1}{2}\rho\cdot\rho\,d\varphi,\qquad S = \frac{1}{2}\int_{\alpha}^{\beta}\rho^2\,d\varphi \tag{32}$$

Fig. 253    Fig. 254
a 


As an example, let us compute the area shaded in Fig. 255. Passing to polar coordinates in the equation x² − y² = 1 of the hyperbola we obtain
$$\rho^2\cos^2\varphi - \rho^2\sin^2\varphi = 1,\quad\text{i.e.}\quad \rho^2 = \frac{1}{\cos^2\varphi-\sin^2\varphi}$$
Consequently, by formula (32), we derive
$$S = \frac{1}{2}\int_{0}^{\varphi}\frac{d\varphi}{\cos^2\varphi-\sin^2\varphi} = \frac{1}{4}\ln\frac{1+\tan\varphi}{1-\tan\varphi}$$
(verify the calculations!).

The above result implies an interesting consequence. We have
$$\frac{1+\tan\varphi}{1-\tan\varphi} = e^{4S},\quad\text{i.e.}\quad \tan\varphi = \frac{e^{4S}-1}{e^{4S}+1} = \frac{e^{2S}-e^{-2S}}{e^{2S}+e^{-2S}} = \tanh 2S$$
(see Sec. I.28), and therefore
$$NM = \rho\sin\varphi = \frac{\sin\varphi}{\sqrt{\cos^2\varphi-\sin^2\varphi}} = \frac{\tan\varphi}{\sqrt{1-\tan^2\varphi}} = \frac{\tanh 2S}{\sqrt{1-\tanh^2 2S}} = \frac{\tanh 2S}{1/\cosh 2S} = \sinh 2S$$
We similarly find that ON = ρ cos φ = cosh 2S and AP = tan φ = tanh 2S. The comparison of these results with Fig. 256 where φ = 2S, MN = sin 2S, ON = cos 2S and AP = tan 2S reveals the geometric meaning of the resemblance between trigonometric (circular) functions and hyperbolic functions and accounts for the origin of the terms "hyperbolic" sine, cosine and tangent.

Fig. 255    Fig. 256

9. The Arc Length of a Curve. We have already dealt with the differential of the arc length in our course (see Sec. VII.23). We shall denote it by dL:
$$dL = \sqrt{dx^2+dy^2+dz^2} = \sqrt{\dot x^2+\dot y^2+\dot z^2}\,dt$$
We shall agree that dL ≥ 0 and therefore take the sign + in front of the radical. It follows that if the values t = α and t = β of the parameter correspond to the end-points of the arc then its length is
$$L = \int_{\alpha}^{\beta}\sqrt{\dot x^2+\dot y^2+\dot z^2}\,dt$$
For a plane curve the formula of the arc length is simplified:
$$L = \int_{\alpha}^{\beta}\sqrt{dx^2+dy^2} = \int_{\alpha}^{\beta}\sqrt{\dot x^2+\dot y^2}\,dt$$
If a curve is represented by an equation of the form y = f(x) (a ≤ x ≤ b) then
$$L = \int_{a}^{b}\sqrt{dx^2+dy^2} = \int_{a}^{b}\sqrt{1+y'^2}\,dx$$
For instance, the arc length of the part of a cycloid [represented by equations (II.12)] between its neighbouring cusps is computed with the help of the above formula $L = \int_{\alpha}^{\beta}\sqrt{\dot x^2+\dot y^2}\,dt$ with the parameter ψ in place of t. In computing the arc length we take into account the symmetry of the arc:
$$L = 2\int_{0}^{\pi} R\sqrt{[(\psi-\sin\psi)']^2+[(1-\cos\psi)']^2}\,d\psi = 2R\int_{0}^{\pi}\sqrt{(1-\cos\psi)^2+\sin^2\psi}\,d\psi = 2R\int_{0}^{\pi}\sqrt{2-2\cos\psi}\,d\psi =$$
$$= 4R\int_{0}^{\pi}\sin\frac{\psi}{2}\,d\psi = -8R\cos\frac{\psi}{2}\Big|_{0}^{\pi} = 8R$$
The result is extremely simple!
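The plane arc-length formula can be tried out numerically on the same cycloid arch. This is a minimal sketch with hypothetical names (`simpson`, `arc_length`); the derivatives of x = R(ψ − sin ψ) and y = R(1 − cos ψ) are supplied by hand.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def arc_length(dx, dy, t0, t1, n=1000):
    """L = integral of sqrt(x'^2 + y'^2) dt for a plane curve x(t), y(t)."""
    return simpson(lambda t: math.hypot(dx(t), dy(t)), t0, t1, n)

R = 1.0  # assumed sample radius
length = arc_length(lambda p: R * (1.0 - math.cos(p)),   # x'(psi)
                    lambda p: R * math.sin(p),           # y'(psi)
                    0.0, 2.0 * math.pi)
print(length, 8.0 * R)
```

Here the whole arch 0 ≤ ψ ≤ 2π is integrated directly instead of doubling the half-arch, which should give the same value 8R.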


The differential of the arc length in polar coordinates is readily obtained from Fig. 257:
$$dL = \sqrt{(d\rho)^2+(\rho\,d\varphi)^2} \tag{33}$$
It follows that if the equation of a curve in polar coordinates is given in the form ρ = f(φ) then its arc length corresponding to the interval α ≤ φ ≤ β of variation of φ is equal to
$$L = \int_{\alpha}^{\beta}\sqrt{d\rho^2+\rho^2\,d\varphi^2} = \int_{\alpha}^{\beta}\sqrt{\Big(\frac{d\rho}{d\varphi}\Big)^2+\rho^2}\,d\varphi$$
We suggest that the reader should check that expression (33) can also be obtained from the formulas
$$dL = \sqrt{dx^2+dy^2},\quad x = \rho\cos\varphi \quad\text{and}\quad y = \rho\sin\varphi$$
As an example, let us consider the arc length of the cardioid depicted in Fig. 72. Taking advantage of its polar equation put down in Fig. 72 we find the arc length:
$$L = 2\int_{0}^{\pi}\sqrt{4a^2\sin^2\varphi+4a^2(1-\cos\varphi)^2}\,d\varphi = 16a$$
(verify the result!).

10. Computing Volumes of Solids. Suppose that we have a solid and that we know the areas of its parallel sections by planes perpendicular to an axis (see Fig. 258). Let it be necessary to compute the volume of the solid. Let x be the coordinate reckoned along the axis and let the area of the section by the plane passing through the point x be S = S(x). Take the volume of the part of the solid lying to the left of the plane passing through x. If we increase x by Δx = dx (for definiteness, we take Δx > 0) then the plane is moved to the right, and an additional volume is added to the former volume. This additional volume is the volume of the "slice" which can be regarded as a cylinder with a wide base of area S(x) and small altitude dx, within an accuracy to infinitesimals of higher order of smallness. It follows that
$$\Delta V = S(x)\,\Delta x + \text{infinitesimals of higher order}$$
i.e.
$$dV = S(x)\,dx$$
Now if x varies from a to b we obtain
$$V = \int_{a}^{b} S(x)\,dx \tag{34}$$

Fig. 257

Let us take an example. We shall compute the volume of a solid which is bounded by the surface of a right circular cylinder, by the half-disc lying in a plane perpendicular to the axis of the cylinder and by the part of an oblique plane passing through the diameter of the disc. The solid is depicted in Fig. 259; it has the form of a "hoof". Since the triangles ABC and A'B'C' are similar according to Fig. 259, the area of the section which is shaded in Fig. 259 is equal to
$$S = \frac{1}{2}\sqrt{R^2-x^2}\cdot\frac{\sqrt{R^2-x^2}}{R}\,H = \frac{R^2-x^2}{2R}\,H$$
Consequently, by formula (34), we have
$$V = \int_{-R}^{R}\frac{R^2-x^2}{2R}\,H\,dx = \frac{H}{2R}\Big(R^2x-\frac{x^3}{3}\Big)\Big|_{-R}^{R} = \frac{2}{3}R^2H$$
Note that the number π does not enter into the answer!

Fig. 259    Fig. 260

We now consider the volume of a solid of revolution. Let a curve having the equation y = f(x) rotate in space about the x-axis and let it describe the boundary surface of the solid of revolution. Then the area of the section of the solid by the plane perpendicular to the x-axis and passing through the point x is equal to S = πy² where y = f(x) (see Fig. 260). Hence, by formula (34), we have
$$V = \pi\int_{a}^{b} y^2\,dx = \pi\int_{a}^{b}[f(x)]^2\,dx \tag{35}$$
For instance, we can regard a sphere of radius R as a solid generated by the revolution of the semi-circle having the equation $y = \sqrt{R^2-x^2}$. The volume of the sphere is therefore equal to
$$V = 2\pi\int_{0}^{R}\big(\sqrt{R^2-x^2}\big)^2\,dx = 2\pi\Big(R^2x-\frac{x^3}{3}\Big)\Big|_{0}^{R} = \frac{4}{3}\pi R^3$$
Compare the above deduction of the formula with the long and artificial procedure applied to deducing the formula of the volume of a sphere in elementary mathematical courses.
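Formula (35) can be turned into a few lines of code. The sketch below is an illustration under stated assumptions: `simpson` and `volume_of_revolution` are hypothetical helper names, and the sphere radius is an arbitrary sample value.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def volume_of_revolution(f, a, b, n=1000):
    """V = pi * integral of f(x)^2 dx for the curve y = f(x) rotated about the x-axis."""
    return math.pi * simpson(lambda x: f(x) ** 2, a, b, n)

R = 1.5  # assumed sample radius

def semicircle(x):
    # max(...) guards against tiny negative values caused by rounding
    return math.sqrt(max(0.0, R * R - x * x))

V = volume_of_revolution(semicircle, -R, R)
print(V, 4.0 / 3.0 * math.pi * R ** 3)
```

Since the integrand R² − x² is a quadratic polynomial, Simpson's rule reproduces the exact value up to rounding, so the two printed numbers should agree to many digits.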

11. Computing Area of Surface of Revolution. The formula for computing the area of an arbitrary surface will be established in Sec. XVI.10. But the computation of the area of a surface of revolution can be discussed now. Let a curve y = f(x) ≥ 0 rotate about the x-axis (see Fig. 261). Let us draw a plane perpendicular to the x-axis and passing through the point x where x is considered to be variable. Take the area of the portion of the surface of revolution lying to the left of the plane. If the plane is moved by the distance dx to the right the area of the portion is increased by the surface element which is shaded in Fig. 261. The element has the form of a ring-shaped surface of width dL with the circumference (length) 2πy because the radius of the ring is equal to y. Consequently, we have
$$dS = 2\pi y\,dL = 2\pi y\sqrt{1+y'^2}\,dx$$
and thus
$$S = 2\pi\int_{a}^{b} y\,dL = 2\pi\int_{a}^{b} y\sqrt{1+y'^2}\,dx \tag{36}$$

Fig. 261    Fig. 262


As an example, let us compute the area of the portion of a paraboloid of revolution intercepted by a plane perpendicular to the axis of revolution (see Fig. 262). Let the radius R of the base and the altitude H be given. The surface being generated by the revolution of a parabola whose axis coincides with the x-axis, the equation of the curve has the form x = ky². The constant k must be chosen so that the parabola should pass through the point (H, R). This implies H = kR², i.e. $k = \frac{H}{R^2}$. Finally, the equation of the curve is
$$x = \frac{H}{R^2}\,y^2 \quad\text{or}\quad y = R\sqrt{\frac{x}{H}}$$
Taking advantage of formula (36) we obtain
$$S = 2\pi\int_{0}^{H} R\sqrt{\frac{x}{H}}\,\sqrt{1+\Big[\Big(R\sqrt{\frac{x}{H}}\Big)'\Big]^2}\,dx = 2\pi\int_{0}^{H} R\sqrt{\frac{x}{H}}\,\sqrt{1+\frac{R^2}{4Hx}}\,dx = \frac{\pi R}{H}\int_{0}^{H}\sqrt{4Hx+R^2}\,dx =$$
$$= \frac{\pi R}{H}\cdot\frac{(4Hx+R^2)^{3/2}}{6H}\Big|_{0}^{H} = \frac{\pi R}{6H^2}\left[(4H^2+R^2)^{3/2}-R^3\right]$$

§ 3. Numerical Integration


12. General Remarks. The basic method of evaluating a definite
integral by means of the corresponding indefinite integral described 
in Sec. 3 is sometimes inexpedient and even practically inappli- 
cable. As it was indicated in Sec. XIII.11, there are many indefinite 
integrals of elementary functions which cannot be expressed in 
terms of elementary functions or which have such expressions that 
are too complicated. Besides, a function which we have to integrate 
can be represented in a way which does not yield its analytical 
expression. In these cases one can use a number of methods which 
we are going to review here. 

1. Some integrals are expressible in terms of certain thoroughly 
studied and tabulated non-elementary “special” functions. 

For instance, one of these functions is the error function
$$\operatorname{Erf} x = \int_{0}^{x} e^{-t^2}\,dt \quad (-\infty < x < \infty) \tag{36'}$$
Further examples are the Fresnel integrals (named after A. Fresnel, 1788-1827, a prominent French physicist, the creator of the wave theory of light),
$$C(x) = \int_{0}^{x}\cos\frac{\pi t^2}{2}\,dt \quad\text{and}\quad S(x) = \int_{0}^{x}\sin\frac{\pi t^2}{2}\,dt \quad (-\infty < x < \infty)$$
the exponential integral
$$\operatorname{Ei} x = \int_{-\infty}^{x}\frac{e^{t}}{t}\,dt \quad (-\infty < x < 0)$$
the sine integral
$$\operatorname{Si} x = \int_{0}^{x}\frac{\sin t}{t}\,dt \quad (-\infty < x < \infty)$$
the cosine integral
$$\operatorname{Ci} x = \int_{\infty}^{x}\frac{\cos t}{t}\,dt \quad (0 < x < \infty)$$
and many other functions. Integrals with infinite limits will be considered in full in § 4.
Let us take an example. In order to evaluate the integral
$$I = \int_{0}^{1}\frac{\sin^2 x}{x^2}\,dx$$
we integrate it by parts putting u = sin² x and $dv = x^{-2}\,dx$:
$$I = -\frac{\sin^2 x}{x}\Big|_{0}^{1} + \int_{0}^{1}\frac{2\sin x\cos x}{x}\,dx = -\sin^2 1 + \int_{0}^{1}\frac{\sin 2x}{x}\,dx$$
Now performing the change of variable 2x = t we get
$$I = -\sin^2 1 + \int_{0}^{2}\frac{\sin t}{t}\,dt = -\sin^2 1 + \operatorname{Si} 2 = 0.8973$$
The value of the sine integral is taken from [23]. A great number of special functions are described in this book.
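The reduction to the sine integral can be cross-checked without tables. The following is a sketch under assumptions: Si is computed from its power series (a standard expansion, not given in the text above), and the helper names `simpson`, `Si` and `integrand` are hypothetical.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def Si(x, terms=30):
    """Sine integral via the series: sum of (-1)^k x^(2k+1) / ((2k+1) * (2k+1)!)."""
    return sum((-1) ** k * x ** (2 * k + 1) / ((2 * k + 1) * math.factorial(2 * k + 1))
               for k in range(terms))

def integrand(x):
    # (sin x / x)^2, with its limiting value 1 at x = 0
    return 1.0 if x == 0.0 else (math.sin(x) / x) ** 2

direct = simpson(integrand, 0.0, 1.0)        # the integral I evaluated directly
by_parts = -math.sin(1.0) ** 2 + Si(2.0)     # the expression -sin^2 1 + Si 2
print(direct, by_parts)
```

Both numbers should round to the value 0.8973 quoted above.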

2. It is sometimes possible to find the exact value of a definite integral with certain specific limits without calculating the corresponding indefinite integral. Calculations of this type are usually difficult but nevertheless we shall give some examples further [for instance, see formula (72)]. Many integrals of this kind are collected in [49].

For example, in this book we can find the formulas
$$\int_{0}^{\pi/2}\tan^{p} x\,dx = \frac{\pi}{2\cos\frac{p\pi}{2}} \quad (-1 < p < 1),$$
$$\int_{0}^{\pi}\ln\sin x\,dx = -\pi\ln 2 \quad\text{etc.}$$
but the corresponding indefinite integrals are not elementary functions.

3. Expansions of the integrand into series of different types are 
also often used for integration. This method will be described at 
length in Chapter XVII but the simple power series which were 
mentioned in Sec. IV.16 can be readily applied now. 

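For instance, taking series (IV.55) for the function $e^x$ we obtain
$$\int_{0}^{1}\frac{e^{x}-1}{x}\,dx = \int_{0}^{1}\frac{\Big(1+\frac{x}{1!}+\frac{x^2}{2!}+\frac{x^3}{3!}+\ldots\Big)-1}{x}\,dx = \int_{0}^{1}\Big(\frac{1}{1!}+\frac{x}{2!}+\frac{x^2}{3!}+\ldots\Big)dx = 1+\frac{1}{2\cdot 2!}+\frac{1}{3\cdot 3!}+\ldots = 1.318$$
(the result is accurate to 0.001).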
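The term-by-term integration can be reproduced in a few lines. This is a minimal sketch: `series_value` sums the first terms 1/(k·k!) of the series obtained above, and the comparison value comes from a Simpson quadrature; all names are hypothetical.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def series_value(terms):
    """Partial sum 1 + 1/(2*2!) + 1/(3*3!) + ... with the given number of terms."""
    return sum(1.0 / (k * math.factorial(k)) for k in range(1, terms + 1))

def integrand(x):
    # (e^x - 1)/x, with its limiting value 1 at x = 0; expm1 avoids cancellation
    return 1.0 if x == 0.0 else math.expm1(x) / x

numeric = simpson(integrand, 0.0, 1.0)
for terms in (3, 6, 20):
    print(terms, series_value(terms))
print(numeric)
```

Six terms already appear to be enough for the stated accuracy of 0.001, which illustrates the remark that such series are treated in practice as finite sums.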

It was noted in Sec. IV.16 that in practice such series can be 
treated as finite sums in which the number of terms is taken depend- 
ing on the accuracy chosen. 

4. Graphical integration is used when a function is represented
by its graph. The method is based on the geometric meaning of the 
definite integral (see Sec. 2) which implies that the integral is equal 
to the area of the corresponding curvilinear trapezoid. We can com- 
pute the area either by constructing the graph on the plotting paper 
and figuring the number of squares lying inside the bounding line 
or by using a special instrument, the so-called planimeter. After 
the tracer of the planimeter has been passed round the periphery 
of the area of an arbitrary form which is to be measured we read 
the area on the meter of the planimeter. Since a planimeter is a 
simple mechanism integration with its help is called mechanical 
integration. 

5. The most comprehensive methods applicable to integrals of 
arbitrary functions are the methods of numerical integration which 
are reviewed in Sec. 13. These methods can be used for functions 
represented in any possible way, especially for functions represented 
by means of tables. 

13. Formulas of Numerical Integration. These formulas make it possible to evaluate approximate values of a definite integral if the values of the integrand at certain points (so-called nodes) of the interval of integration are given.

Let us begin with the most elementary formula. Let it be necessary to evaluate the integral
$$\int_{a}^{b} y\,dx,\quad y = f(x) \tag{37}$$
Suppose that the interval of integration a ≤ x ≤ b is divided into a finite number n of equal parts and that the values of the integrand at the points of division are given or calculated. Introduce the notation
$$\frac{b-a}{n} = h,\quad f(a) = y_0,\quad f(a+h) = y_1,\quad f(a+2h) = y_2,\ \ldots,\ f(a+nh) = f(b) = y_n$$
If we draw the ordinates at each of the nodes the curvilinear trapezoid whose area is equal to integral (37) is broken into n parts (see Fig. 263). Each of these parts is also a curvilinear trapezoid. Now let us replace the parts by rectilinear trapezoids whose bases are pairs of neighbouring ordinates (see Fig. 263). The areas of these trapezoids are equal to
$$\frac{y_0+y_1}{2}\,h,\quad \frac{y_1+y_2}{2}\,h,\quad \ldots,\quad \frac{y_{n-1}+y_n}{2}\,h$$

Fig. 263

Adding the areas together we obtain the area of a polygonal figure inscribed into the original curvilinear trapezoid. If n is sufficiently large, that is if h is sufficiently small, the area of the polygon will be approximately equal to the area of the curvilinear trapezoid, i.e. to the integral. Thus, we obtain
$$\int_{a}^{b} y\,dx \approx \frac{y_0+y_1}{2}\,h + \frac{y_1+y_2}{2}\,h + \ldots + \frac{y_{n-1}+y_n}{2}\,h$$
or
$$\int_{a}^{b} y\,dx \approx h\Big(\frac{y_0}{2}+y_1+y_2+\ldots+y_{n-1}+\frac{y_n}{2}\Big) \tag{38}$$
This is the so-called trapezoid formula (trapezoid rule).
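The trapezoid formula (38) works directly on a table of ordinates. The sketch below assumes equally spaced nodes; the function name `trapezoid` and the sample table are hypothetical.

```python
def trapezoid(ys, h):
    """Trapezoid rule (38) for ordinates ys = [y0, y1, ..., yn] with constant step h."""
    if len(ys) < 2:
        raise ValueError("need at least two ordinates")
    return h * (ys[0] / 2 + sum(ys[1:-1]) + ys[-1] / 2)

# Sample table: f(x) = x^2 on [0, 1] with n = 4 equal parts.
a, b, n = 0.0, 1.0, 4
h = (b - a) / n
ys = [(a + k * h) ** 2 for k in range(n + 1)]
approx = trapezoid(ys, h)
print(approx)   # compare with the exact value 1/3 of the integral
```

For this convex integrand the chords lie above the curve, so the approximation overestimates the integral; halving h should reduce the error roughly by a factor of four.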
We can give an interpretation of the trapezoid formula which 
is independent of its geometric meaning. In effect, before 
integrating the function we replaced it by linear functions which 
take on the values of the integrand y = f (x) at the end-points 
of each interval of the form $a \le x \le a+h$, $a+h \le x \le a+2h$, 
etc. Hence, we can say that we have performed linear 
interpolation (see Sec. I.22) before evaluating the integral. Now 
let us recall that we gave interpolation formulas in § V.2 which 
approximate functions with a greater accuracy than formulas of 
linear interpolation. Therefore formulas of numerical integration 
based on these interpolation formulas are much more accurate than 
formula (38). 
If we use interpolation polynomials of the second degree we shall 
arrive at Simpson’s formula named after the English mathematician 
T. Simpson (1710-1761) who deduced the formula. Let us first suppose 
that we know the values of the integrand at three points of the form 
$x_0$, $x_0+h$ and $x_0+2h$, that is we are given 

$$y(x_0) = y_0, \quad y(x_0+h) = y_1 \quad \text{and} \quad y(x_0+2h) = y_2$$

Then we can put down the interpolation polynomial of the second 
degree which assumes the same values at these points. By Newton’s 
formula (see Sec. V.8), the polynomial is 

$$P(x) = y_0 + \frac{\Delta y_0}{h}(x-x_0) + \frac{\Delta^2 y_0}{2h^2}(x-x_0)(x-x_0-h)$$


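Now performing the integration over the subinterval 
$x_0 \le x \le x_0+2h$ we get, by means of the substitution 
$x - x_0 = s$, $dx = ds$, $0 \le s \le 2h$, the expression 

$$\int_{x_0}^{x_0+2h} P(x)\,dx = \int_0^{2h}\left[y_0 + \frac{\Delta y_0}{h}s + \frac{\Delta^2 y_0}{2h^2}s(s-h)\right]ds = y_0\cdot 2h + \Delta y_0\cdot 2h + \frac{\Delta^2 y_0}{3}h$$

Further, if we substitute 

$$\Delta y_0 = y_1 - y_0, \qquad \Delta^2 y_0 = \Delta y_1 - \Delta y_0 = (y_2-y_1)-(y_1-y_0) = y_2 - 2y_1 + y_0$$

then after collecting similar terms we obtain 

$$\int_{x_0}^{x_0+2h} f(x)\,dx \approx \int_{x_0}^{x_0+2h} P(x)\,dx = 2h\left[y_0 + (y_1-y_0) + \frac{1}{6}(y_2-2y_1+y_0)\right] = h\,\frac{y_0+4y_1+y_2}{3} \tag{39}$$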


Now suppose that the interval of integration $a \le x \le b$ is 
divided into 2n equal subintervals with the help of the points of division 

$$x_0 = a, \quad x_1 = a+h, \quad x_2 = a+2h, \quad \ldots, \quad x_{2n} = a+2nh = b$$

where $h = \frac{b-a}{2n}$. Then we can apply formula (39) to each pair 
of subintervals of the form $x_0+(2k-2)h \le x \le x_0+(2k-1)h$ 
and $x_0+(2k-1)h \le x \le x_0+2kh$ (k = 1, 2, ..., n): 

$$\int_{x_0}^{x_0+2h} y\,dx \approx h\,\frac{y_0+4y_1+y_2}{3}, \qquad \int_{x_0+2h}^{x_0+4h} y\,dx \approx h\,\frac{y_2+4y_3+y_4}{3}, \quad \ldots,$$

$$\int_{x_0+(2n-2)h}^{x_0+2nh} y\,dx \approx h\,\frac{y_{2n-2}+4y_{2n-1}+y_{2n}}{3}$$

Adding together these formulas and collecting similar terms we 
obtain Simpson’s formula: 

$$\int_a^b y\,dx \approx \frac{h}{3}\left[(y_0+y_{2n}) + 2(y_2+y_4+\ldots+y_{2n-2}) + 4(y_1+y_3+\ldots+y_{2n-1})\right] \tag{40}$$
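Simpson’s formula can be sketched in the same way as the trapezoid formula. The code below is only an illustration under our own naming (`simpson`, with `n` denoting the number of pairs of subintervals, so that there are 2n parts in all, as in the text).

```python
def simpson(f, a, b, n):
    """Simpson's formula (40): the interval [a, b] is divided into 2n equal parts."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    h = (b - a) / (2 * n)
    ys = [f(a + i * h) for i in range(2 * n + 1)]  # y_0, ..., y_{2n}
    ends = ys[0] + ys[-1]            # y_0 + y_{2n}
    even = sum(ys[2:-1:2])           # y_2 + y_4 + ... + y_{2n-2}
    odd = sum(ys[1::2])              # y_1 + y_3 + ... + y_{2n-1}
    return h / 3 * (ends + 2 * even + 4 * odd)


if __name__ == "__main__":
    # Illustrative integrand (our choice): x**3 on [0, 2], exact value 4.
    print(simpson(lambda x: x ** 3, 0.0, 2.0, 1))
```

Since formula (39) integrates the interpolating parabola exactly and, as the error estimate of this section shows, the cubic term contributes nothing, the sketch returns the exact value for polynomials up to the third degree (up to rounding).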


Now we proceed to estimate the accuracy of formulas (38) and 
(40). Newton’s formula (V.27) implies that in performing linear 
interpolation on a subinterval we get an error of the order of $\Delta^2 y$, 
i.e. of the order of $h^2$ (see Sec. V.7). According to formula (18), we 
can estimate the corresponding absolute error of the integral taken 
over the subinterval if we multiply the error of the interpolation 
by the length h of the subinterval of integration. Hence, the error 
of the integral on a subinterval is of the order of $h^3$. Formula (38) 
is obtained by adding together n approximate formulas having the 
errors of the order of $h^3$. Therefore, the number of subintervals being 
equal to $n = \frac{b-a}{h}$, the resultant error is of the order of 

$$n\cdot h^3 = \frac{b-a}{h}\,h^3 = (b-a)\cdot h^2$$

that is of the order of $h^2$. For instance, if we double the number of 
the division points the accuracy of formula (38) will increase, 
approximately, four times. 

One might think that analogous considerations applied to formula 
(40) must indicate that its error is of the order of $h^3$. But this is 
wrong because in reality the accuracy is still higher. Actually, when 
dropping the term containing $\Delta^3 y_0$ which enters into Newton's 
formula we make an error of the order of $h^3$. But it turns out that 
the integral of the term is equal to zero because 

$$\int_{x_0}^{x_0+2h}\frac{x-x_0}{h}\left(\frac{x-x_0}{h}-1\right)\left(\frac{x-x_0}{h}-2\right)dx = \int_0^{2h}\left(\frac{s^3}{h^3}-\frac{3s^2}{h^2}+\frac{2s}{h}\right)ds = \left.\left(\frac{s^4}{4h^3}-\frac{3s^3}{3h^2}+\frac{2s^2}{2h}\right)\right|_0^{2h} = 0$$

Hence, the error which occurs after integration is determined by 
the subsequent term of Newton’s formula. This subsequent term 
is of the order of $h^4$. Consequently, the error of formula (39) is of 
the order of $h^5$ and the error of final formula (40) is of the order 
of $h^4$. For example, if the number of the points of division is 
doubled the accuracy of formula (40) increases 16 times. At the 
same time the application of formula (40) is not much more 
complicated than that of formula (38). 

Let us take an example. The exact value of the integral 
$I = \int_0^1 \frac{dx}{1+x^2}$ is readily found: 

$$I = \int_0^1 \frac{dx}{1+x^2} = \arctan x\,\Big|_0^1 = \frac{\pi}{4} = 0.785$$

If we did not know the answer we could evaluate the integral 
approximately by means of formula (38) or (40). For simplicity’s sake, 
let us divide the interval of integration into two parts (n = 2 in 
formula (38) and n = 1 in formula (40)), that is 

$$h = 0.5, \quad x_0 = 0, \quad x_1 = 0.5, \quad x_2 = 1, \quad y_0 = 1.000, \quad y_1 = 0.800, \quad y_2 = 0.500$$

Applying formula (38) we get the value 

$$I \approx 0.5\left(\frac{1.000+0.500}{2}+0.800\right) = 0.775$$

whose error is about 1.3 per cent. The calculations according to 
formula (40) yield the value 

$$I \approx \frac{0.5}{3}\,(1.000+0.500+4\times 0.800) = 0.783$$

the error being about 0.3 per cent. If we had taken ten parts the error 
of formula (38) would have been about 0.05 per cent and the error 
of formula (40) about $10^{-6}$ per cent. 
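The figures of this example can be checked by a short computation. The sketch below is self-contained; the names `trapezoid`, `simpson` and `error_percent` are ours, and `parts` denotes the total number of equal parts into which the interval 0 ≤ x ≤ 1 is divided.

```python
import math


def f(x):
    """Integrand of the example: 1 / (1 + x^2)."""
    return 1.0 / (1.0 + x * x)


def trapezoid(func, a, b, parts):
    """Formula (38) with the given number of equal parts."""
    h = (b - a) / parts
    ys = [func(a + i * h) for i in range(parts + 1)]
    return h * ((ys[0] + ys[-1]) / 2 + sum(ys[1:-1]))


def simpson(func, a, b, parts):
    """Formula (40); `parts` is the total (even) number of equal parts."""
    if parts % 2:
        raise ValueError("Simpson's formula needs an even number of parts")
    h = (b - a) / parts
    ys = [func(a + i * h) for i in range(parts + 1)]
    return h / 3 * (ys[0] + ys[-1] + 2 * sum(ys[2:-1:2]) + 4 * sum(ys[1::2]))


def error_percent(approx, exact=math.pi / 4):
    """Relative error in per cent with respect to the exact value pi/4."""
    return abs(approx - exact) / exact * 100


if __name__ == "__main__":
    for parts in (2, 10, 20):
        t = trapezoid(f, 0.0, 1.0, parts)
        s = simpson(f, 0.0, 1.0, parts)
        print(parts, t, error_percent(t), s, error_percent(s))
```

Passing from 10 to 20 parts should reduce the error of the trapezoid formula about four times, in agreement with the estimate of the order of $h^2$. For this particular integrand the error of Simpson's formula decreases even faster than the general estimate $h^4$ predicts, which explains the very small error for ten parts.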

In books devoted to numerical methods in mathematics one can 
find formulas of approximate integration that are more accurate 
than Simpson’s formula. We sometimes use formulas constructed 
for nodes which are not equally spaced. 


§ 4. Improper Integrals 


Up till now we have considered definite integrals with finite 
intervals of integration and with integrands which do not approach 
infinity on the intervals. We shall call such integrals proper. If 
at least one of the above conditions is not fulfilled the integral is 
called improper. A proper integral of a continuous function (and 
of a function of some wider class) always has a certain numerical 
value. In contrast to it, improper integrals which we are going 
to study here may not have such a value. 

14. Integrals with Infinite Limits of Integration. First let us take 
an integral of the form 

$$I = \int_a^\infty f(x)\,dx \tag{41}$$

where the lower limit a and the integrand f (x) (considered on 
$a \le x < \infty$) are supposed to be finite. This integral is improper 
because its upper limit is infinitely large. 

To define integral (41) in an exact sense we use the same approach 
as that applied in Sec. III.6 to the sum of an infinite series. Namely, 
we first “truncate” the integral, i.e. we cut off an infinite portion 
of its interval of integration and consider the integral 

$$\int_a^N f(x)\,dx \tag{42}$$

where N is a large but finite number. Integral (42) is proper and 
possesses a certain numerical value. Then we make N tend to infinity 
because in integral (41) we have the sign of infinity as the upper 
limit of integration. Integral (42) varies in a certain manner as 
$N \to \infty$. If, in this process, it has a certain finite limit we say that 
integral (41) is convergent. In this case we put, by definition, 

$$\int_a^\infty f(x)\,dx = \lim_{N\to\infty}\int_a^N f(x)\,dx \tag{43}$$


If there is no finite limit integral (41) is said to be divergent. 
In such a case we shall not define a numerical value of the integral 
in our course (although even in this case it is sometimes possible 
to speak about the value of the integral). Hence, we shall speak 
about the numerical value of an improper integral of type (41) 
only if it converges. 

Let us note a particular case of divergence: if integral (42) has 
an infinite limit as $N \to \infty$ then integral (41) is said to diverge 
to infinity; in this case formula (43) makes sense and can be used. 

Let us consider several examples. Suppose a point T is moving 
under the action of a force which is directed along the straight 
line connecting T with a fixed point O (from T to O) and whose 
absolute value is inversely proportional to the square of the distance 
from O to T. In particular, gravitational force and force of 
attraction between two charges of electricity are of this kind. Suppose 
it is necessary to compute the work which should be expended to 
remove the point T from a position $T_0$ into infinity. This work is 
called the potential of the force. 

To perform the computation let us write the expression of the 
force: 

$$F = \frac{k}{s^2} \qquad (s = OT)$$

where k is a proportionality factor. Then, by Sec. 6, the work is 

$$A = \int_{s_0}^\infty \frac{k}{s^2}\,ds \qquad (s_0 = OT_0) \tag{44}$$

This is an improper integral which must be calculated by formula 
(43): 

$$A = \lim_{N\to\infty}\int_{s_0}^N \frac{k}{s^2}\,ds = \lim_{N\to\infty}\left(-\frac{k}{s}\right)\bigg|_{s=s_0}^{s=N} = \lim_{N\to\infty}\left(\frac{k}{s_0}-\frac{k}{N}\right) = \frac{k}{s_0}$$

Thus, integral (44) converges. We see that the potential is 
inversely proportional to the first power of the distance from O to $T_0$. 
At first glance it may seem strange that the work corresponding 
to the motion along an infinite path turns out to be finite although, 
theoretically, the force never stops acting on the point. The 
explanation is that the force decreases so fast as the point moves towards 
infinity that the expended work tends to a finite quantity but not 
to infinity although it increases all the time. The geometric meaning 
of the result is illustrated in Fig. 264: despite the fact that the shaded 
geometric figure extends to infinity its altitude (that is its ordinate 
$F = \frac{k}{s^2}$) decreases so fast that its area turns out to be finite. 

Fig. 264 


Of course, in reality s varies within a finite range extending not 
to infinity but to a finite value S (which is very large) because all 
physical quantities are finite. Hence, in reality we have an integral 
of the form 

$$\int_{s_0}^{S} \frac{k}{s^2}\,ds \tag{45}$$

But beginning with some sufficiently large value of S this integral 
does not practically vary and therefore it can be replaced by the 
“limiting” integral (44) because it is easier to investigate (44) in 
theoretical studies as the value S is not known exactly. The 
significance of the convergence of integral (44) is that it provides for 
the possibility of replacing “real” integral (45) by integral (44) for 
large S. From the physical point of view this means that the action 
of O upon T can be neglected when the distance from the point O 
becomes sufficiently large. In performing such a replacement we 
do not need the exact value of S; the only important thing here 
is that we must be sure of S being sufficiently large. 
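As a second example, let us consider the improper integral 

$$\int_1^\infty \frac{1}{x}\,dx \tag{46}$$

Since 

$$\lim_{N\to\infty}\int_1^N \frac{1}{x}\,dx = \lim_{N\to\infty}\ln x\,\Big|_1^N = \lim_{N\to\infty}\ln N = \ln\infty = \infty$$

we see that integral (46) diverges to infinity. Thus, we can write 

$$\int_1^\infty \frac{dx}{x} = \infty$$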

Finally, consider the improper integral 

$$\int_0^\infty \sin x\,dx \tag{47}$$

In this case we have neither a finite nor an infinite value of the 
limit 

$$\lim_{N\to\infty}\int_0^N \sin x\,dx = \lim_{N\to\infty}(-\cos x)\Big|_0^N = \lim_{N\to\infty}(1-\cos N)$$

because it does not exist since the values of cos N, as $N \to \infty$, 
“oscillate” within the limits from −1 to +1 all the time. 
Consequently, integral (47) does not diverge to infinity but it diverges 
in an oscillating way, its values constantly varying from 0 to 2 
and back from 2 to 0. Hence, the integral has neither a finite nor 
an infinite value. 
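The three kinds of behaviour met in examples (44), (46) and (47) are easily observed numerically if we tabulate the truncated integral (42) for increasing N. The following sketch uses the antiderivatives found above; the helper name `truncated_values` and the particular values k = 1, $s_0$ = 1 are illustrative assumptions of ours.

```python
import math


def truncated_values(F, a, Ns):
    """Values F(N) - F(a) of the truncated integral (42) for each N in Ns,
    where F is an antiderivative of the integrand."""
    return [F(N) - F(a) for N in Ns]


Ns = [10.0, 1e2, 1e3, 1e4]

# Example (44) with k = 1 and s_0 = 1 (illustrative values): antiderivative -1/s.
work = truncated_values(lambda s: -1.0 / s, 1.0, Ns)

# Example (46): antiderivative ln x, the values grow without bound.
log_case = truncated_values(math.log, 1.0, Ns)

# Example (47): antiderivative -cos x, the values keep oscillating between 0 and 2.
osc_case = truncated_values(lambda x: -math.cos(x), 0.0, Ns)

if __name__ == "__main__":
    for N, w, l, o in zip(Ns, work, log_case, osc_case):
        print(N, w, l, o)
```

The first column of values settles near k/$s_0$ = 1, the second increases slowly but steadily, and the third shows no tendency to any limit.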

15. Basic Properties of Integrals with Infinite Limits of Integration. 
Many properties of proper integrals are automatically extended to 
improper integrals of form (41). 

First of all, if an antiderivative F (x) of the function f (x) is known 
in the interval $a \le x < \infty$ then 

$$\int_a^\infty f(x)\,dx = \lim_{N\to\infty}\int_a^N f(x)\,dx = \lim_{N\to\infty}[F(N)-F(a)] = F(\infty)-F(a)$$

because $F(\infty)$ is nothing but the notation for $\lim_{N\to\infty} F(N)$. Hence, 
in this case improper integral (41) can be evaluated by means of 
formula (11) deduced for proper integrals. The expression $F(\infty)$ 
itself indicates whether the integral diverges or converges. 
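For instance, in examples (44), (46) and (47) we could have 
calculated in the following way: 

$$A = \int_{s_0}^\infty \frac{k}{s^2}\,ds = -\frac{k}{s}\bigg|_{s_0}^\infty = -\frac{k}{\infty}+\frac{k}{s_0} = \frac{k}{s_0},$$

$$\int_1^\infty \frac{dx}{x} = \ln x\,\Big|_1^\infty = \ln\infty - \ln 1 = \infty \quad \text{and}$$

$$\int_0^\infty \sin x\,dx = -\cos x\,\Big|_0^\infty = -\cos\infty + 1$$

The last result shows in fact that integral (47) does not exist since 
the expression $\cos\infty$ makes no sense. 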

All the basic properties enumerated in Sec. 4 also remain true 
for improper integrals with natural exceptions involving the case 
of a divergent integral. For instance, formulas (18) and (19) no 
longer hold because an integral of a nonzero constant taken over 
an infinite interval always diverges. We also note the following 
simple property: if integral (41) converges we have 

$$\int_N^\infty f(x)\,dx = \int_a^\infty f(x)\,dx - \int_a^N f(x)\,dx \xrightarrow[N\to\infty]{} 0$$


If it is difficult to compute the corresponding indefinite integral 
we usually begin the investigation of the improper integral with 
testing its convergence or divergence on the basis of special tests 
which we are going to study now. 
First of all, it should be noted that the convergence or divergence 
of integral (41) is determined only by the behaviour of the function 
f (x) as x approaches infinity, that is for sufficiently large x. In 
other words, the integrals $\int_a^\infty f(x)\,dx$ and $\int_b^\infty f(x)\,dx$ converge or 
diverge simultaneously provided f (x) does not approach infinity 
at a point lying between a and b. In fact, the difference between the 
integrals is a proper integral having a certain finite numerical value 
and therefore it cannot violate the convergence in case one of the 
integrals converges and it cannot provide for the convergence if one 
of the integrals diverges. 

We first consider an integral of a non-negative function: 

$$\int_a^\infty f(x)\,dx, \qquad f(x) \ge 0 \tag{48}$$

Such an integral either converges or diverges to infinity since the 
integral taken from a to N is a non-decreasing quantity here as 
$N \to \infty$, and such a quantity either has a finite limit or tends to 
infinity (see Sec. III.5). In this case the convergence or divergence 
means, geometrically, that the area of the infinite plane figure shaded 
in Fig. 265 is, respectively, finite or infinite. 

Fig. 265 

The fact that integral (48) converges or diverges in this case 
can be indicated, respectively, by the formula 

$$\int_a^\infty f(x)\,dx < \infty$$

or 

$$\int_a^\infty f(x)\,dx = \infty$$

Of course, these formulas cannot be applied to divergent integrals 
of “oscillating” type (47). 
The simplest test for convergence is the comparison test: if 

$$0 \le g(x) \le f(x) \qquad (a \le x < \infty) \tag{49}$$

and $\int_a^\infty f(x)\,dx < \infty$, that is this integral converges, then we also 
have $\int_a^\infty g(x)\,dx < \infty$, that is the last integral converges too. The 
proof of the test is implied by integration of inequality (49) or by 
the geometric meaning of convergence (see Fig. 265). The same test 
suggests that if conditions (49) hold and the integral of g (x) diverges 
(equals infinity) then the integral of f (x) also diverges. 

We also use the following test: if 

$$\frac{f(x)}{g(x)} \xrightarrow[x\to\infty]{} k \ne 0 \qquad (k \ne \infty) \tag{50}$$

then the integrals 

$$\int_a^\infty f(x)\,dx \quad \text{and} \quad \int_b^\infty g(x)\,dx$$

converge or diverge simultaneously (although in the case of 
convergence their numerical values can considerably differ even if k = 1 
and a = b). Indeed, condition (50) implies that neither of the 
functions f (x) and g (x) can be considerably larger than the other as 
$x \to \infty$, i.e. $f(x) \sim kg(x)$ where ~ is the sign of equivalence (see 
Sec. III.7). Therefore, if the shaded figure of the type shown in 
Fig. 265 has a finite area for one of the functions the same must be 
true for the other. 

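Most often we compare a given integral of form (48) with the 
integral of a power function of the form 

$$\int_1^\infty \frac{dx}{x^p} \tag{51}$$

which can be easily investigated in a direct manner. If p > 1 we have 

$$I = \int_1^\infty x^{-p}\,dx = \frac{x^{-p+1}}{-p+1}\bigg|_1^\infty = \frac{1}{(1-p)\,x^{p-1}}\bigg|_1^\infty = \frac{1}{(1-p)\,\infty^{p-1}} - \frac{1}{1-p} = \frac{1}{p-1}$$

whereas, for p < 1, we have 

$$I = \frac{x^{-p+1}}{-p+1}\bigg|_1^\infty = \frac{x^{1-p}}{1-p}\bigg|_1^\infty = \frac{\infty^{1-p}}{1-p} - \frac{1}{1-p} = \infty$$

Finally, if p = 1 then 

$$I = \int_1^\infty \frac{dx}{x} = \ln\infty - \ln 1 = \infty$$

Thus, integral (51) converges for p > 1 and diverges to infinity 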
for $p \le 1$. Hence, by test (50), we can draw the same conclusion 
concerning integral (48) if 

$$f(x) \sim \frac{k}{x^p} \qquad (x \to \infty, \ k \ne 0)$$
For instance, 


because 

=" 1 

EnF 
z Vipra z 


that is $p = \frac{1}{2} < 1$ here. We also have 

$$\int_0^\infty \frac{dx}{\sqrt{x^3+1}} < \infty \tag{52}$$

since 

$$\frac{1}{\sqrt{x^3+1}} \sim \frac{1}{x^{3/2}}$$

and thus $p = \frac{3}{2} > 1$ in this case. Finally, 

$$\int_0^\infty e^{-x}\,dx < \infty$$

because an exponential function (with a negative exponent) decreases 
(as its argument tends to infinity) faster than any power function 
(see Sec. IV.14), and therefore comparison test (49) is applicable. 
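The behaviour of the power integral (51) for different p can be illustrated by tabulating its truncated value for a large upper limit N. The sketch below uses the antiderivative of $x^{-p}$; the function name `power_integral` is our own.

```python
import math


def power_integral(p, N):
    """Integral of x**(-p) taken from 1 to N, computed from the antiderivative."""
    if p == 1:
        return math.log(N)
    return (N ** (1 - p) - 1) / (1 - p)


if __name__ == "__main__":
    for p in (0.5, 1.0, 1.5, 2.0):
        # For p > 1 the values approach 1 / (p - 1); for p <= 1 they keep growing.
        print(p, [power_integral(p, N) for N in (1e2, 1e4, 1e6, 1e8)])
```

For p = 1.5 the values approach 2 and for p = 2 they approach 1, i.e. 1/(p − 1) in both cases, whereas for p = 0.5 and p = 1 no finite limit is observed.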

Now we turn to integrals of functions which can assume the values 
of arbitrary sign: 

$$\int_a^\infty f(x)\,dx \tag{53}$$

where f (x) may be positive for some values of x and negative for 
others. For such an integral we shall give only one test: if 

$$\int_a^\infty |f(x)|\,dx < \infty \tag{54}$$

then integral (53) converges. In this case the integral is said to be 
absolutely convergent, and the function f (x) is called absolutely 
integrable on the interval $a \le x < \infty$. 
To prove the test we introduce the functions $f^+(x)$ and $f^-(x)$ 
which are the “positive and the negative parts” of the function f (x): 

$$f^+(x) = \begin{cases} f(x) & \text{for } x \text{ such that } f(x) \ge 0 \\ 0 & \text{for } x \text{ such that } f(x) < 0 \end{cases}$$

and 

$$f^-(x) = \begin{cases} 0 & \text{for } x \text{ such that } f(x) \ge 0 \\ |f(x)| & \text{for } x \text{ such that } f(x) < 0 \end{cases}$$

These functions are shown in Fig. 266. We can write 

$$f(x) = f^+(x) - f^-(x) \tag{55}$$

$$|f(x)| = f^+(x) + f^-(x)$$

If condition (54) is fulfilled, the area shaded in Fig. 266d is finite. 
Hence, the areas shaded in Fig. 266b and c are also finite. The 
inequalities $f^+(x) \ge 0$ and $f^-(x) \ge 0$ holding, the improper integrals 

$$\int_a^\infty f^+(x)\,dx \quad \text{and} \quad \int_a^\infty f^-(x)\,dx$$

of these functions converge. Now, by equality (55), integral (53) 
also converges which is what we set out to prove. More precisely, 

$$\int_a^\infty f(x)\,dx = \int_a^\infty f^+(x)\,dx - \int_a^\infty f^-(x)\,dx$$

Fig. 266 


It can happen that integral (54) diverges, that is it equals infinity, 
whereas integral (53) converges. This is the so-called conditional 
convergence. In this case integral (53) converges but not absolutely. 
In such a case the areas shaded in Fig. 266b and c are infinite but 
at the same time they “balance” so that if we take into account 
their signs the infinities in Fig. 266a “cancel out” which leads to 
a finite result. 
For example, the integral 

$$\int_1^\infty \frac{\sin x}{x^2}\,dx \tag{56}$$

is absolutely convergent because 

$$\left|\frac{\sin x}{x^2}\right| \le \frac{1}{x^2}$$

and therefore the integral of the absolute value of the integrand 
can be compared with integral (51) for p = 2. 
If we perform the same operation on the integral 

$$\int_1^\infty \frac{\sin x}{x}\,dx \tag{57}$$

we shall arrive at integral (51) with p = 1 which diverges. Therefore 
the comparison test does not apply here [see inequality (49) and 
think why the test is inapplicable]. At the same time it is possible 
to prove that 

$$\int_1^\infty \left|\frac{\sin x}{x}\right| dx = \int_1^\infty |\sin x|\,\frac{1}{x}\,dx = \infty$$

because the first factor under the integral sign oscillates about the 
positive mean value. But it turns out that integral (57) nevertheless 
converges. To prove this let us perform integration by parts denoting 
$\frac{1}{x} = u$, $du = -\frac{1}{x^2}\,dx$, and $\sin x\,dx = dv$, $v = -\cos x$: 

$$\int_1^\infty \frac{\sin x}{x}\,dx = -\frac{\cos x}{x}\bigg|_1^\infty - \int_1^\infty \frac{\cos x}{x^2}\,dx = \cos 1 - \int_1^\infty \frac{\cos x}{x^2}\,dx$$

The last integral is of the same type as (56) and therefore it 
converges. Hence, the original integral is also convergent. In other 
words, integral (57) converges conditionally but not absolutely. 
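The difference between the two integrals can also be seen numerically. In the sketch below the truncated integrals taken from 1 to N are evaluated by Simpson's formula with a small step; the helper names (`simpson`, `truncated`, `plain`, `absolute`) and the step 0.01 are our own assumptions, and the computation only illustrates the behaviour without proving it.

```python
import math


def simpson(func, a, b, parts):
    """Simpson's formula; `parts` is the total (even) number of equal parts."""
    h = (b - a) / parts
    ys = [func(a + i * h) for i in range(parts + 1)]
    return h / 3 * (ys[0] + ys[-1] + 2 * sum(ys[2:-1:2]) + 4 * sum(ys[1::2]))


def truncated(func, N, step=0.01):
    """Approximate integral of func from 1 to N with parts of length about `step`."""
    parts = 2 * max(1, round((N - 1.0) / (2 * step)))
    return simpson(func, 1.0, N, parts)


def plain(x):
    """Integrand of (57)."""
    return math.sin(x) / x


def absolute(x):
    """Absolute value of the integrand of (57)."""
    return abs(math.sin(x)) / x


if __name__ == "__main__":
    for N in (10.0, 100.0, 1000.0):
        print(N, truncated(plain, N), truncated(absolute, N))
```

The values in the first column settle down as N grows, whereas those in the second column keep increasing roughly like a multiple of ln N, which is what the mean value argument given above suggests.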

The above results are automatically extended to integrals of 
complex functions depending on a real argument (see Sec. VIII.6) 
and to integrals of the form 

$$\int_{-\infty}^b f(x)\,dx \tag{58}$$

which are defined by the relation 

$$\int_{-\infty}^b f(x)\,dx = \lim_{M\to\infty}\int_{-M}^b f(x)\,dx$$

By the way, we can easily pass from integral (58) to an integral 
of form (41) by means of the substitution x = −y. 
16. Other Types of Improper Integral. Now we consider an 
improper integral of the form 

$$\int_a^b f(x)\,dx \tag{59}$$

with finite limits of integration whose integrand is not finite at one 
of the end-points of the interval of integration; for instance let f (x) 
approach infinity or be unbounded as $x \to a$. To attribute a certain 
numerical value to integral (59) we cut off an interval containing 
the “dangerous” end-point and pass to the limit after that: 

$$\int_a^b f(x)\,dx = \lim_{\varepsilon\to+0}\int_{a+\varepsilon}^b f(x)\,dx$$

As in Sec. 14, in case there is no finite limit we say that integral 
(59) diverges. 

All the properties enumerated in Sec. 15 can be directly extended 
to these integrals but there is a difference between integrals (41) 
and (59) concerning integration of power functions. When 
investigating integral (59) by comparison with an integral of a power 
function we must take an integral of the form 

$$\int_a^b \frac{dx}{(x-a)^p} \tag{60}$$

instead of integral (51). It is easy to verify that integral (60) 
converges for p < 1 and diverges for $p \ge 1$ (check this!). 
An integral of type (59) whose integrand approaches infinity or 
is unbounded as the argument tends to the upper limit of integration 
is treated similarly. 
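The statement concerning integral (60) is readily checked if we cut off a small interval of length ε at the singular end-point and then let ε decrease. The sketch below uses the antiderivative of $(x-a)^{-p}$; the function name `cut_off_integral` and the choice b − a = 1 are illustrative assumptions of ours.

```python
import math


def cut_off_integral(p, eps, length=1.0):
    """Integral of (x - a)**(-p) from a + eps to b, where b - a = length."""
    if p == 1:
        return math.log(length / eps)
    return (length ** (1 - p) - eps ** (1 - p)) / (1 - p)


if __name__ == "__main__":
    for p in (0.5, 1.0, 2.0):
        # For p < 1 the values tend to length**(1 - p) / (1 - p); otherwise they grow.
        print(p, [cut_off_integral(p, eps) for eps in (1e-2, 1e-4, 1e-8)])
```

For p = 0.5 the values approach 2, whereas for p = 1 and p = 2 they increase without bound as ε tends to zero.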
The points of the interval of integration at which the integrand 
is not finite and the end-points of the interval which lie at infinity 
are singularities of the integral. Up to now we have considered 
integrals with only one singularity lying at an end-point of the interval 
of integration. If there is a singularity lying inside the interval of 
integration or if there are several such singularities we attribute 
a numerical value to the integral in question according to the 
following scheme. 

Suppose we have an integral of the form 

$$\int_a^b f(x)\,dx \tag{61}$$

and let its integrand f (x) not be finite at the points a, c and d, that 
is let integral (61) have three singularities (the singularities are 
represented by circles in Fig. 267). Then we divide the interval 
of integration into parts by means of points of division (these points 
are represented by asterisks in Fig. 267) so that only one singularity 
should lie on each of the subintervals at one of its end-points. We 
see five such subintervals in Fig. 267, namely aα, αc, cβ, βd and db. 

Fig. 267 

If each of the integrals 

$$\int_a^\alpha f(x)\,dx, \quad \int_\alpha^c f(x)\,dx, \quad \int_c^\beta f(x)\,dx, \quad \int_\beta^d f(x)\,dx \quad \text{and} \quad \int_d^b f(x)\,dx \tag{62}$$

converges we say that integral (61) is also convergent, and its value, 
by definition, is equal to the sum of integrals (62). But if at least 
one of integrals (62) is divergent we regard integral (61) as divergent. 
In this case we do not attribute any numerical value to it. 

In particular, the numerical value of an integral of the form 

$$\int_{-\infty}^{\infty} f(x)\,dx$$

where the function f (x) is finite is introduced according to the above 
scheme; to do this we must take one point of division. 

The tests described in Secs. 15 and 16 can be applied to each of 
the integrals (62). Therefore they can be used for investigating 
integrals (61). In particular, this suggests that if 

$$\int_a^b |f(x)|\,dx < \infty$$

then integral (61) must converge. In this case the integral is said 
to be absolutely convergent and the function f (x) is called an 
absolutely integrable (summable) function over the interval $a \le x \le b$. 

We now dwell in more detail on integrals having two singularities 
lying at both end-points of intervals of integration. Suppose the 
integral $\int_a^b f(x)\,dx$ belongs to this type. If we manage to find an 
antiderivative F (x) of the integrand f (x) the evaluation of the 
integral can be carried out in the following way (α being a point of 
division lying between a and b): 

$$\int_a^b f(x)\,dx = \int_a^\alpha f(x)\,dx + \int_\alpha^b f(x)\,dx = \lim_{\varepsilon\to+0}\int_{a+\varepsilon}^\alpha f(x)\,dx + \lim_{\varepsilon\to+0}\int_\alpha^{b-\varepsilon} f(x)\,dx =$$

$$= \lim_{\varepsilon\to+0}[F(\alpha)-F(a+\varepsilon)] + \lim_{\varepsilon\to+0}[F(b-\varepsilon)-F(\alpha)] = F(b-0)-F(a+0)$$

Thus, in this case we can use our usual formula (11) for evaluating 
the definite integral, and if the substitution of the limits of 
integration for the argument of the function yields finite results the 
integral is convergent. 

We now suppose that integral (61) has a singularity lying inside 
the interval of integration, for instance, at the point x = c. If an 
antiderivative F (x) is known then we have 

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx = [F(c-0)-F(a)] + [F(b)-F(c+0)] = F(b)-F(a) + [F(c-0)-F(c+0)] \tag{63}$$

Consequently, if the antiderivative has no discontinuities, i.e. 
if F (c − 0) = F (c + 0), formula (11) can be applied to evaluating 
the integral. But if the antiderivative has jump discontinuities 
we must make the necessary corrections as we have done in formula 
(63). Finally, if the antiderivative has discontinuities of more 
complicated types inside the interval of integration, in particular, if 
it approaches infinity at some points, then the integral diverges. 
The above rules apply to the case of an arbitrary (finite) number of 
singularities. 


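Take an example. The integral $\int_{-1}^{2} \frac{dx}{\sqrt[3]{x}}$ can be evaluated as 
follows: 

$$\int_{-1}^{2} \frac{dx}{\sqrt[3]{x}} = \frac{3}{2}\,x^{2/3}\bigg|_{-1}^{2} = \frac{3}{2}\left(\sqrt[3]{4}-1\right) = 0.881$$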

because in this case the antiderivative is proportional to $x^{2/3}$, i.e. 
it is continuous, and therefore the integral converges. The last 
example in Sec. 4 was treated incorrectly because the antiderivative 
which we had there approached infinity on the interval of 
integration at the point x = 0 and therefore the integral was divergent. 

In evaluating improper integrals we widely use expansions of 
their integrands into series of different kinds. If such an expansion 
yields a good approximation only near a singularity then the integral 
in question is broken into a sum of two integrals so that one of the 
integrals should be an improper integral taken over a subinterval 
containing the singularity and the other integral should be an 
ordinary (proper) integral. Then the first integral is evaluated by 
means of a series expansion whereas the second one is computed 
according to the methods of § 3. For example, to evaluate integral 
(52) we can perform the following operation: 

$$\int_0^\infty \frac{dx}{\sqrt{x^3+1}} = \int_0^a \frac{dx}{\sqrt{x^3+1}} + \int_a^\infty x^{-3/2}\left(1+\frac{1}{x^3}\right)^{-1/2}dx =$$

$$= \int_0^a \frac{dx}{\sqrt{x^3+1}} + \int_a^\infty \left(x^{-3/2} - \frac{1}{2}\,x^{-9/2} + \frac{3}{8}\,x^{-15/2} - \ldots\right)dx =$$

$$= \int_0^a \frac{dx}{\sqrt{x^3+1}} + \frac{2}{\sqrt{a}} - \frac{1}{7\sqrt{a^7}} + \frac{3}{52\sqrt{a^{13}}} - \ldots = \int_0^a \frac{dx}{\sqrt{x^3+1}} + S \tag{64}$$

Here, in expanding the integrand, we have utilized Newton’s second 
binomial formula (IV.60). The number a > 0 entering into (64) 
can be chosen arbitrarily. If we take very large a we shall encounter 
difficulties in evaluating the integral $\int_0^a \frac{dx}{\sqrt{x^3+1}}$ but if we take 
a very small a then the terms of the series whose sum is denoted 
by S will be very large, and it will be difficult to evaluate the sum. 
Let us take a = 2 and evaluate the last integral by means of 
Simpson's formula (see Sec. 13). We divide the interval of 
integration into eight parts. This yields 

$$\int_0^2 \frac{dx}{\sqrt{x^3+1}} = 1.402$$

Calculating S with the same accuracy we obtain S = 1.402. Hence 
integral (52) is equal to 2.804 (let the reader verify all the 
calculations!). 
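The calculations recommended to the reader can be carried out with the help of the following sketch. It evaluates the integral of $1/\sqrt{x^3+1}$ from 0 to a = 2 by Simpson's formula with eight parts and sums the series obtained by integrating the binomial expansion of $x^{-3/2}(1+x^{-3})^{-1/2}$ term by term. The function names and the number of terms retained in the series are our own assumptions.

```python
import math


def integrand(x):
    """Integrand of (52): 1 / sqrt(x^3 + 1)."""
    return 1.0 / math.sqrt(x ** 3 + 1.0)


def simpson(func, a, b, parts):
    """Simpson's formula; `parts` is the total (even) number of equal parts."""
    h = (b - a) / parts
    ys = [func(a + i * h) for i in range(parts + 1)]
    return h / 3 * (ys[0] + ys[-1] + 2 * sum(ys[2:-1:2]) + 4 * sum(ys[1::2]))


def tail_series(a, terms=6):
    """Sum S of the series for the integral from a to infinity (requires a > 1)."""
    total = 0.0
    coeff = 1.0  # binomial coefficient of (1 + u)**(-1/2) at u**k, with u = x**(-3)
    for k in range(terms):
        power = 1.5 + 3 * k  # the k-th term of the integrand is coeff * x**(-power)
        total += coeff * a ** (1 - power) / (power - 1)
        coeff *= (-0.5 - k) / (k + 1)
    return total


if __name__ == "__main__":
    head = simpson(integrand, 0.0, 2.0, 8)
    S = tail_series(2.0)
    print(head, S, head + S)
```

The first three terms produced by `tail_series` are $2/\sqrt{a}$, $-1/(7\sqrt{a^7})$ and $3/(52\sqrt{a^{13}})$, in agreement with the expansion of the integrand of (52) for large x.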


When we deal with a divergent integral, for instance, of form 
(41), we can encounter the problem of characterizing the behaviour 
of its “finite part” (42), which approaches infinity as $N \to \infty$, in 
a more precise manner. The investigation can also be carried out 
by means of expansions into series. For example, 

$$\int_0^N \frac{dx}{\sqrt[3]{x^2+1}} = \int_0^a \frac{dx}{\sqrt[3]{x^2+1}} + \int_a^N \frac{dx}{\sqrt[3]{x^2+1}} =$$

$$= \int_0^a \frac{dx}{\sqrt[3]{x^2+1}} + \int_a^N \left(x^{-2/3} - \frac{1}{3}\,x^{-8/3} + \frac{2}{9}\,x^{-14/3} - \ldots\right)dx =$$

$$= 3N^{1/3} + C + \frac{1}{5}\,N^{-5/3} - \frac{2}{33}\,N^{-11/3} + \ldots$$

where $C = \int_0^a \frac{dx}{\sqrt[3]{x^2+1}} - 3a^{1/3} - \frac{1}{5}\,a^{-5/3} + \frac{2}{33}\,a^{-11/3} - \ldots$ is a 
constant which can be calculated by means of the technique applied 
to the previous example. 

An analogous problem can arise when we investigate a convergent 
integral. In this case integral (41) tends to a finite limit and it is 
the rate of its variation in the process of approaching the limit that 
can be investigated by means of expansions into series. For example, 
reasoning as we did in performing calculations (64), we obtain 


N Eo E 
l>] = Iyan -a 1 


0 0 


Integrating by parts we find 
N œ 


0 0 N 
-|era beeps | iao ena 
0 N v 


7 1 
-ye +a quantity of the order of ET 


The above estimation can also be easily proved with the help 
of L’Hospital’s rule which we leave to the reader. To specify the 
expansion we can perform repeated integration by parts. 

17. Gamma Function. As an important example of an improper 
integral let us consider the non-elementary gamma function 
introduced by Euler in 1729: 

$$\Gamma(p) = \int_0^\infty e^{-x}x^{p-1}\,dx \tag{65}$$

This expression is also called Euler's integral of the second kind. 

Integral (65) is improper since it has infinity as its upper limit 
of integration and, besides, it has a singularity at x = 0 for p < 1. 
We know that $e^{-x}$ tends to zero, as $x \to \infty$, faster than any negative 
power of x and therefore the behaviour of the integrand at $x = \infty$ 
does not affect the convergence or divergence of the integral. On the 
other hand, we have $e^{-x}x^{p-1} \sim \frac{1}{x^{1-p}}$ for $x \to 0$ and therefore, by 
the beginning of Sec. 16, integral (65) converges for 1 − p < 1, 
i.e. for p > 0, and diverges for $1 - p \ge 1$, i.e. for $p \le 0$. Therefore 
we shall consider formula (65) for $0 < p < \infty$. 

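To deduce the basic property 

$$\Gamma(p+1) = p\,\Gamma(p) \tag{66}$$

of the gamma function we integrate by parts: 

$$\Gamma(p+1) = \int_0^\infty e^{-x}x^{(p+1)-1}\,dx = \int_0^\infty e^{-x}x^{p}\,dx = -e^{-x}x^{p}\,\Big|_0^\infty + \int_0^\infty e^{-x}\,p\,x^{p-1}\,dx$$

which implies (66) since the integrated term vanishes at both limits 
for p > 0. 
Further, we readily find 

$$\Gamma(1) = \int_0^\infty e^{-x}x^{0}\,dx = \int_0^\infty e^{-x}\,dx = -e^{-x}\,\Big|_0^\infty = 1$$

If we now substitute, in succession, p = 1, 2, 3, ... into formula 
(66) we obtain 

$$\Gamma(2) = 1\cdot\Gamma(1) = 1, \quad \Gamma(3) = 2\,\Gamma(2) = 2\cdot 1, \quad \Gamma(4) = 3\,\Gamma(3) = 3\cdot 2\cdot 1 \ \text{etc.}$$

Generally, 

$$\Gamma(n+1) = n! \qquad (n = 1, 2, 3, \ldots) \tag{67}$$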


Thus we see that the gamma function yields a representation of 
the factorial function. At the same time the gamma function also 
assumes certain values for non-integral values of the argument and 
therefore it extends the factorial function (see Sec. I.15) from discrete 
values of the argument to the continuous range of the argument. 
The graph of the function is shown in Fig. 268. The equality 
$\Gamma(+0) = +\infty$ suggested by formula (66) is also illustrated in 
the figure. 
In particular, formula (67) implies that 

$$0! = \Gamma(1) = 1$$

Further, we have 

$$\left(-\frac{1}{2}\right)! = \Gamma\left(-\frac{1}{2}+1\right) = \Gamma\left(\frac{1}{2}\right) = \sqrt{\pi} = 1.772$$

We shall establish the last relation $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$ in Sec. 18 [see 
formula (71)]. It follows that 

$$\left(\frac{1}{2}\right)! = \Gamma\left(\frac{3}{2}\right) = \frac{1}{2}\,\Gamma\left(\frac{1}{2}\right) = \frac{\sqrt{\pi}}{2} = 0.886,$$

$$\left(\frac{3}{2}\right)! = \Gamma\left(\frac{5}{2}\right) = \frac{3}{2}\,\Gamma\left(\frac{3}{2}\right) = \frac{3\sqrt{\pi}}{4} = 1.329 \ \text{etc.}$$
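These values are easy to check with the gamma function of the Python standard library (`math.gamma`). In addition, the sketch below evaluates integral (65) directly by Simpson's formula on a truncated interval; the helper `gamma_by_quadrature`, the truncation point 60 and the number of parts are our own assumptions, and the direct evaluation is restricted to p ≥ 1 so that the integrand stays finite at x = 0.

```python
import math


def simpson(func, a, b, parts):
    """Simpson's formula; `parts` is the total (even) number of equal parts."""
    h = (b - a) / parts
    ys = [func(a + i * h) for i in range(parts + 1)]
    return h / 3 * (ys[0] + ys[-1] + 2 * sum(ys[2:-1:2]) + 4 * sum(ys[1::2]))


def gamma_by_quadrature(p, upper=60.0, parts=6000):
    """Integral (65) truncated at `upper`; only for p >= 1 (no singularity at 0)."""
    if p < 1:
        raise ValueError("this sketch handles only p >= 1")
    return simpson(lambda x: math.exp(-x) * x ** (p - 1), 0.0, upper, parts)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, math.gamma(n + 1), math.factorial(n))   # formula (67)
    print(math.gamma(0.5), math.sqrt(math.pi))            # Gamma(1/2) and sqrt(pi)
    print(math.gamma(1.5), math.gamma(2.5))               # (1/2)! and (3/2)!
    print(gamma_by_quadrature(4.0), gamma_by_quadrature(2.5))
```

The truncation of the upper limit is harmless here because $e^{-x}$ makes the discarded part of the integral negligibly small.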


The gamma function can also be defined for the negative values of 
the argument but it is impossible to use formula (65) for this 
purpose since the integral diverges for $p \le 0$. However, we 
can rewrite formula (66) in the form 

$$\Gamma(p) = \frac{\Gamma(p+1)}{p} \tag{68}$$

and use it for defining the gamma function for negative p. 

Fig. 268          Fig. 269 

Indeed, if −1 < p < 0 then 0 < p + 1 < 1 and therefore the 
right-hand side of (68) makes sense for −1 < p < 0. Hence, 
formula (68) defines the values of the gamma function Γ (p) for such p. 
It should be noted that in this way we obtain Γ (p) < 0 for 
−1 < p < 0. Further, if we take −2 < p < −1 then −1 < p + 1 < 0 
and therefore the right-hand side is defined for these p by the 
preceding extension of the gamma function into the interval 
−1 < p < 0. By the way, we see that Γ (p) > 0 for −2 < p < −1. 
We next define Γ (p) for −3 < p < −2 in the same way etc. Hence, 
Γ (p) has been defined for the values of p of arbitrary signs, and 
formula (66) holds for all p. Applying formula (68) we conclude, 
in succession, that $\Gamma(0) = \pm\infty$, $\Gamma(-1) = \pm\infty$, $\Gamma(-2) = \pm\infty$ 
etc. The graph of the gamma function for the negative values of 
the argument is shown in Fig. 269. 
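The step-by-step extension of the gamma function to negative values of the argument by means of the relation Γ (p) = Γ (p + 1)/p can be written as a short recursive procedure. The sketch below is an illustration only; the name `gamma_extended` is ours, and the values for p > 0 are taken from `math.gamma` of the Python standard library.

```python
import math


def gamma_extended(p):
    """Gamma(p) for p > 0 and for negative non-integral p, using
    Gamma(p) = Gamma(p + 1) / p repeatedly until the argument becomes positive."""
    if p > 0:
        return math.gamma(p)
    if p == int(p):
        raise ValueError("the gamma function is infinite at 0, -1, -2, ...")
    return gamma_extended(p + 1) / p


if __name__ == "__main__":
    for p in (-0.5, -1.5, -2.5):
        # The sign alternates from one interval (-n-1, -n) to the next.
        print(p, gamma_extended(p))
```

For instance, Γ (−1/2) is obtained as Γ (1/2) divided by −1/2, that is $-2\sqrt{\pi}$, which is negative as stated above for −1 < p < 0.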

There are extensive tables of the gamma function. In particular, 
they can be found in [23]. 

18. Beta Function. The beta function, or Euler’s integral of the 
first kind, is defined by the formula 

$$B(p, q) = \int_0^1 x^{p-1}(1-x)^{q-1}\,dx \tag{69}$$

Here we must have p > 0 and q > 0 since otherwise the behaviour 
of the integrand as x approaches the upper and the lower limits of 
integration yields the divergence of the integral (why is it so?). 
It should be noticed that indefinite integral (69) is expressible in 
terms of elementary functions only for certain specific combinations 
of the values of the exponents p and q (see Sec. XIII.9). 

As we shall show in Sec. XVI.17, the beta function is expressed
in terms of the gamma function by the formula

$$B(p, q) = \frac{\Gamma(p)\, \Gamma(q)}{\Gamma(p + q)} \tag{70}$$


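The formula suggests an interesting corollary: if we put
$p = q = \tfrac{1}{2}$ we obtain

$$\frac{\left[\Gamma\left(\tfrac{1}{2}\right)\right]^2}{\Gamma(1)} = B\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = \int_0^1 \frac{dx}{\sqrt{x(1-x)}} = \int_0^1 \frac{dx}{\sqrt{\tfrac{1}{4} - \left(x - \tfrac{1}{2}\right)^2}} = \arcsin(2x - 1)\Big|_0^1 = \pi$$

Hence, the inequality $\Gamma(p) > 0$ holding for $p > 0$, we have

$$\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi} \tag{71}$$

From this we can deduce the value of an important integral of the
form $\int_0^\infty e^{-x^2}\, dx$ by using the substitution $x = \sqrt{t}$:

$$\int_0^\infty e^{-x^2}\, dx = \int_0^\infty e^{-t}\, \tfrac{1}{2}\, t^{-\frac{1}{2}}\, dt = \tfrac{1}{2} \int_0^\infty e^{-t}\, t^{\frac{1}{2} - 1}\, dt = \tfrac{1}{2}\, \Gamma\left(\tfrac{1}{2}\right) = \frac{\sqrt{\pi}}{2} \tag{72}$$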

Many definite integrals containing power and exponential func- 
tions which cannot be expressed in terms of elementary functions 
for arbitrary values of parameters involved can often be expressed 
in terms of the beta function and, hence, in terms of the gamma func- 
tion. For instance, we have 


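$$\int_0^{\pi/2} \sin^p x\, dx = \frac{1}{2} \int_0^1 t^{\frac{p-1}{2}} (1 - t)^{-\frac{1}{2}}\, dt = \frac{1}{2} B\left(\frac{p+1}{2}, \frac{1}{2}\right) = \frac{\sqrt{\pi}}{2}\, \frac{\Gamma\left(\frac{p+1}{2}\right)}{\Gamma\left(\frac{p}{2} + 1\right)}$$

(we have used the substitution $\sin x = \sqrt{t}$ here),

$$\int_0^\infty \frac{x^{p-1}}{(1+x)^{p+q}}\, dx = \int_0^1 \frac{y^{p-1} (1-y)^{p+q}}{(1-y)^{p-1}}\, \frac{dy}{(1-y)^2} = \int_0^1 y^{p-1} (1-y)^{q-1}\, dy = B(p, q) \quad \text{for } p > 0,\ q > 0 \tag{73}$$

where $x = \frac{y}{1-y}$. Further, we have

$$\int_0^\infty \frac{dx}{(1 + x^q)^p} = \frac{1}{q} \int_0^\infty \frac{y^{\frac{1}{q} - 1}}{(1+y)^p}\, dy = \frac{1}{q} B\left(\frac{1}{q},\ p - \frac{1}{q}\right) \quad \text{for } q > 0,\ pq > 1 \tag{74}$$

where $x = y^{1/q}$ [we have utilized formula (73) here]. In particular,
formula (74) yields the value of integral (52):

$$\int_0^\infty \frac{dx}{\sqrt{1 + x^3}} = \frac{1}{3} B\left(\frac{1}{3},\ \frac{1}{2} - \frac{1}{3}\right) = \frac{1}{3} B\left(\frac{1}{3}, \frac{1}{6}\right) = \frac{\Gamma\left(\frac{1}{3}\right) \Gamma\left(\frac{1}{6}\right)}{3\, \Gamma\left(\frac{1}{2}\right)} = 2.804$$

(the values of the gamma function have been taken from the tables).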
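
The reduction of such integrals to the beta function can be checked numerically. The following fragment is a rough sketch under stated assumptions: the helper names `beta` and `midpoint` are hypothetical, `math.gamma` replaces the tables, the exponent $p = 3$ is an arbitrary sample, and the last line assumes that the final example above is the integral of $(1 + x^3)^{-1/2}$ over $0 \le x < \infty$.

```python
import math

def beta(p, q):
    """Beta function expressed through the gamma function, formula (70)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

def midpoint(f, a, b, n=100_000):
    """Composite midpoint rule for a proper integral over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

p = 3.0                                   # sample exponent, chosen arbitrarily
numeric = midpoint(lambda x: math.sin(x) ** p, 0.0, math.pi / 2)
closed = 0.5 * beta((p + 1) / 2, 0.5)     # beta-function form of the sine integral
value_52 = beta(1 / 3, 1 / 6) / 3         # assumed reading of the last example
print(numeric, closed, value_52)
```

For $p = 3$ both numbers should agree with the elementary value $2/3$, and the last number with the tabulated value 2.804.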
19. Principal Value of Divergent Integral. It is sometimes advisable
to attribute certain numerical values to some divergent integrals,
in a conditional sense. For instance, such a situation may occur
in investigating physical processes in continuous media. There are
different ways of employing these values. Let us consider Cauchy's
method. Suppose an integral of the form

$$\int_a^b f(x)\, dx \tag{75}$$

has only one singularity inside the interval of integration, say at
the point $x = c$. We cut off a subinterval containing the singularity
which is symmetric with respect to $c$. Then we pass to the limit and
put, by definition,

$$\text{v.p.} \int_a^b f(x)\, dx = \lim_{\varepsilon \to +0} \left[ \int_a^{c-\varepsilon} f(x)\, dx + \int_{c+\varepsilon}^b f(x)\, dx \right] \tag{76}$$


Such a limit can exist even if integral (75) is divergent in the
ordinary sense of the definition given in Sec. 16. If the limit exists we
call (76) the principal value (Cauchy's principal value) of (75). The
notation v.p. in (76) is the abbreviation of the French valeur
principale (principal value). An integral of this type is often called
a singular integral to distinguish it from proper and convergent
improper integrals which are called regular integrals. Similarly, the
principal value of an improper integral taken over the whole axis
of the argument of the integrand is introduced, by definition, as

$$\text{v.p.} \int_{-\infty}^{\infty} f(x)\, dx = \lim_{N \to \infty} \int_{-N}^{N} f(x)\, dx$$

For instance, the integral $\int_{-1}^{2} \frac{dx}{x}$ is divergent because the
antiderivative $\ln |x|$ of the integrand has a discontinuity at the point
$x = 0$ belonging to the interval of integration at which it approaches
infinity. At the same time its principal value

$$\text{v.p.} \int_{-1}^{2} \frac{dx}{x} = \lim_{\varepsilon \to +0} \left[ \int_{-1}^{-\varepsilon} \frac{dx}{x} + \int_{\varepsilon}^{2} \frac{dx}{x} \right] = \lim_{\varepsilon \to +0} \left[ \ln |x| \Big|_{-1}^{-\varepsilon} + \ln |x| \Big|_{\varepsilon}^{2} \right] =$$

$$= \lim_{\varepsilon \to +0} [\ln \varepsilon - \ln 1 + \ln 2 - \ln \varepsilon] = \ln 2 = 0.693$$

exists because the summands $\pm \ln \varepsilon$ mutually cancel out before we
pass to the limit. Another example is the integral $\int_{-\infty}^{\infty} \sin x\, dx$:

$$\text{v.p.} \int_{-\infty}^{\infty} \sin x\, dx = \lim_{N \to \infty} \int_{-N}^{N} \sin x\, dx = \lim_{N \to \infty} (-\cos x) \Big|_{-N}^{N} = \lim_{N \to \infty} [-\cos N + \cos N] = 0$$

Thus, the principal value of the integral exists although the
integral diverges, that is it is singular.

It is apparent that not all divergent integrals possess principal
values.
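
The cancellation of the logarithmic terms in the first example can be observed numerically. The sketch below is illustrative only: `midpoint` and `principal_value` are hypothetical helper names, and the midpoint rule is an assumed (crude but sufficient) way of evaluating the two proper integrals in definition (76).

```python
import math

def midpoint(f, a, b, n=100_000):
    """Composite midpoint rule for a proper integral over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def principal_value(f, a, b, c, eps):
    """Sum of the two integrals in definition (76) for a fixed eps > 0."""
    return midpoint(f, a, c - eps) + midpoint(f, c + eps, b)

f = lambda x: 1.0 / x
for eps in (0.1, 0.01):
    left = midpoint(f, -1.0, -eps)        # equals ln(eps), unbounded as eps -> +0
    total = principal_value(f, -1.0, 2.0, 0.0, eps)
    print(eps, left, total)               # total stays near ln 2
```

Each one-sided integral grows without bound as $\varepsilon$ decreases, whereas their sum does not depend on $\varepsilon$ at all (up to the error of the quadrature).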


§ 5. Integrals Dependent on Parameters 


20. Proper Integrals. Take an integral of the form

$$I = \int_a^b f(x, \lambda)\, dx \tag{77}$$

whose integrand depends on a parameter (arbitrary constant) $\lambda$
besides the variable of integration $x$. The parameter $\lambda$ is regarded
as constant in the process of integration but generally it can assume
different values for which integral (77) is evaluated. And, generally
speaking, the result of the integration can also depend on $\lambda$, i.e.
$I = I(\lambda)$. Such integrals occur in applications when an integrand
can involve such parameters as masses, sizes etc. which are kept
constant in the process of integration. For the sake of simplicity,
we shall take integrands which only depend on one parameter
although similar results are obtained in the case of many parameters.
We first consider several formal examples:

$$\int_0^1 (x^2 + \lambda x)\, dx = \frac{1}{3} + \frac{\lambda}{2}, \qquad \int_0^1 \sin \lambda x\, dx = \frac{1 - \cos \lambda}{\lambda} \qquad \text{and} \qquad \int_0^1 (s + 1)\, x^s\, dx = 1 \quad (s > -1)$$

In this section we shall take proper integrals of form (77), that
is the limits of integration and the integrand will be finite.
Let us consider some properties of these integrals.

1. If the integrand is a continuous function of $\lambda$ for $a \le x \le b$
then the integral $I$ is also continuous in $\lambda$. For example, this is
implied by the geometric meaning of an integral as the area of
a curvilinear trapezoid: if an infinitesimal variation of $\lambda$ yields an
infinitesimal change of the form and of the sizes of the curvilinear
side of the trapezoid (which is the graph of the integrand) then the
area should also gain an infinitesimal increment.

It should be remarked that at the same time the function $f$ may
not be continuous in $x$ and may have finite discontinuities.

We sometimes encounter integrals whose limits of integration
can also depend on a parameter:

$$I(\lambda) = \int_{a(\lambda)}^{b(\lambda)} f(x, \lambda)\, dx \tag{78}$$

Then, for $I(\lambda)$ to be continuous, it is sufficient to set the additional
condition that the functions $a(\lambda)$ and $b(\lambda)$ have no discontinuities.
2. The Leibniz formula

$$\frac{dI}{d\lambda} = \int_a^b f'_\lambda(x, \lambda)\, dx \tag{79}$$

suggests that it is permissible to differentiate integral (77) with
respect to the parameter under the integral sign. The matter is that
integral (77) is analogous to a sum of a great number of very small
summands (see Sec. 2), each of them depending on $\lambda$. Since the
termwise differentiation of a sum is permissible because the derivative
of a sum equals the sum of the derivatives, formula (79) can be
deduced by passing to the limit.

Here we understand formula (79) in the simplest sense, namely,
integral (77) is proper and integral (79) is proper or
convergent. But there are cases when integral (79)
diverges. Then formula (79) remains true provided we understand
[...]

[...] depends and as a variable on which both upper and lower limits
of integration depend. [...] the derivatives of an integral with respect
to its upper and lower limits (see Sec. 4). This implies

$$\frac{d}{d\lambda} \int_{a(\lambda)}^{b(\lambda)} f(x, \lambda)\, dx = \int_{a(\lambda)}^{b(\lambda)} f'_\lambda(x, \lambda)\, dx + f(b(\lambda), \lambda)\, b'(\lambda) - f(a(\lambda), \lambda)\, a'(\lambda) \tag{80}$$

3. When we integrate an integral dependent on a parameter with
respect to the parameter, it is permissible to integrate the integrand
under the sign of the integral in (77), that is

$$\int_\alpha^\beta I(\lambda)\, d\lambda = \int_a^b \left( \int_\alpha^\beta f(x, \lambda)\, d\lambda \right) dx$$

The assertion is justified in the same way as property 2.
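
Differentiation under the integral sign is easy to test on a concrete integrand. The sketch below is an illustration under our own assumptions: the integrand $\sin \lambda x$, the parameter value 1.3 and all helper names are chosen for the example, and a central difference stands in for the exact derivative with respect to $\lambda$.

```python
import math

def midpoint(f, a, b, n=20_000):
    """Composite midpoint rule for a proper integral over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def I_fixed(lam):
    """Integral of sin(lam * x) over 0 <= x <= 1: constant limits, form (77)."""
    return midpoint(lambda x: math.sin(lam * x), 0.0, 1.0)

def I_moving(lam):
    """Same integrand, but the upper limit b(lam) = lam depends on the parameter, form (78)."""
    return midpoint(lambda x: math.sin(lam * x), 0.0, lam)

def central_difference(F, lam, d=1e-4):
    """Numerical derivative of F with respect to the parameter."""
    return (F(lam + d) - F(lam - d)) / (2 * d)

lam = 1.3   # sample parameter value
# Differentiation under the integral sign with constant limits.
rhs_fixed = midpoint(lambda x: x * math.cos(lam * x), 0.0, 1.0)
# Variable upper limit: add the boundary term f(b, lam) * b'(lam); a(lam) = 0 contributes nothing.
rhs_moving = midpoint(lambda x: x * math.cos(lam * x), 0.0, lam) + math.sin(lam * lam) * 1.0
print(central_difference(I_fixed, lam), rhs_fixed)
print(central_difference(I_moving, lam), rhs_moving)
```

In both lines of output the two numbers should coincide to several decimal places; omitting the boundary term in the second case destroys the agreement.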
21. Improper Integrals. We shall take the case of integrals of
the form

$$I = \int_a^\infty f(x, \lambda)\, dx \tag{81}$$

which have no singularities for finite $x$. Improper integrals of other
types (see Sec. 16) dependent on parameters possess similar
properties. Obviously, we suppose that integral (81) converges.
But contrary to Sec. 20, here we can have a situation when the
dependence of an integral on a parameter $\lambda$ may not be continuous
even if the function $f$ is continuous in $\lambda$, which is a new fact for us.
This is possible because even an infinitesimal variation of a function
over an infinite interval of integration may lead to a finite variation
of the value of the integral.
For instance, we shall show in Sec. XVII.32 that

$$\int_0^\infty \frac{\sin x}{x}\, dx = \frac{\pi}{2}$$

It immediately follows that for $\lambda > 0$ we have

$$I(\lambda) = \int_0^\infty \frac{\sin \lambda x}{x}\, dx = \int_0^\infty \frac{\sin s}{s}\, ds = \frac{\pi}{2}$$

(we have made the substitution $\lambda x = s$). At the same time we have
$I = 0$ for $\lambda = 0$ and, for $\lambda < 0$, we obtain

$$I(\lambda) = \int_0^\infty \frac{\sin(-|\lambda| x)}{x}\, dx = -\int_0^\infty \frac{\sin |\lambda| x}{x}\, dx = -\frac{\pi}{2}$$

Hence, in this example $I(\lambda)$ has a jump discontinuity at $\lambda = 0$.
One can find it strange that $I(\lambda)$ is discontinuous because the value
of the improper integral

$$I(\lambda) = \lim_{N \to \infty} \int_0^N \frac{\sin \lambda x}{x}\, dx$$

is obtained as the limit of the values of the corresponding proper
integrals, and each of these proper integrals is continuous in $\lambda$.
The explanation is that the limit of a sequence of continuous
functions may not be a continuous function, as it will be shown here.
We now take the functions

$$I_N(\lambda) = \int_0^N \frac{\sin \lambda x}{x}\, dx$$

whose graphs are depicted in Fig. 270 for small values of $N$ and
for large values of $N$. These functions are continuous in $\lambda$ but the
transition from the values which are close to $-\frac{\pi}{2}$ to the values
which are close to $\frac{\pi}{2}$ takes place on a small interval of variation
of $\lambda$ for large $N$. The greater $N$, the smaller the interval. Therefore
in the limiting process, for $N = \infty$, this transition takes place
on an infinitesimal interval of the $\lambda$-axis, that is there appears a
discontinuity.

[Fig. 270. The graphs of $I_N(\lambda)$ for small values of $N$, for large values of $N$ and for $N = \infty$]
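
The steepening of the graphs near the origin of the parameter axis can be reproduced with a few lines of code. This is only a sketch: the function name `I_N`, the sample parameter value 0.05 and the use of the midpoint rule are our assumptions.

```python
import math

def I_N(lam, N, n=20_000):
    """Proper integral of sin(lam * x) / x over 0 <= x <= N by the midpoint rule."""
    h = N / n
    return h * sum(math.sin(lam * (k + 0.5) * h) / ((k + 0.5) * h) for k in range(n))

lam = 0.05   # a small positive parameter value, chosen for illustration
for N in (1, 100, 1000):
    print(N, I_N(lam, N), I_N(-lam, N))
print("pi/2 =", math.pi / 2)
```

For a fixed small value of the parameter the proper integral is close to zero when $N$ is small and close to $\pi/2$ when $N$ is large, while its value at zero always remains zero. This is the mechanism by which the jump appears in the limit.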

The possibility of the existence of such discontinuities complicates 
the investigation of integrals of form (81) and, particularly, it 
makes it impossible to apply directly properties 2 and 3 in Sec. 20. 
Therefore we sometimes have to replace integral (81) by an integral 
taken over a finite interval from $a$ to $N$ and then pass to the limit as
$N \to \infty$. Nevertheless, as we shall show, there exists an important
particular case when such discontinuities are impossible.

Suppose that the function $f(x, \lambda)$ satisfies the conditions

$$|f(x, \lambda)| \le \varphi(x) \quad (a \le x < \infty) \qquad \text{and} \qquad \int_a^\infty \varphi(x)\, dx < \infty \tag{82}$$

for all the values of $\lambda$ in question where $\varphi(x)$ is a certain function.
Then, by Sec. 15, integral (81) converges for all $\lambda$. In this case we
shall say that the convergence of the integral is regular. For instance,
this is the case for the integral

$$\int_1^\infty \frac{\sin \lambda x}{x^2}\, dx$$

since

$$\left| \frac{\sin \lambda x}{x^2} \right| \le \frac{1}{x^2} \qquad \text{and} \qquad \int_1^\infty \frac{dx}{x^2} < \infty$$

We can assert that if the integrand of integral (81) depends
continuously on $\lambda$ in the case of regular convergence the integral $I$
is continuous in $\lambda$. In fact, the integral can be represented in the
form

$$I(\lambda) = \int_a^N f(x, \lambda)\, dx + \int_N^\infty f(x, \lambda)\, dx$$

The first summand is a proper integral and it is therefore continuous
in $\lambda$. The second summand can be estimated as

$$\left| \int_N^\infty f(x, \lambda)\, dx \right| \le \int_N^\infty |f(x, \lambda)|\, dx \le \int_N^\infty \varphi(x)\, dx$$

and, by condition (82), this integral becomes small for all the values
of $\lambda$ simultaneously, for sufficiently large $N$ (see Sec. 15). The
variation of the whole sum $I(\lambda)$ corresponding to a small variation
of $\lambda$ must therefore be small which means that $I$ depends
continuously on $\lambda$.

The properties of regularly convergent integrals are completely
analogous to those of proper integrals described in Sec. 20.


§ 6. Line Integrals 


22. Line Integrals of the First Type. The third example in Sec. 4
is an example of a line integral of this type. The general definition
is formulated as follows.

Suppose there is a curve (L) of finite length lying in space or in
a plane. Let a quantity $u$ be determined at each point of the curve.
If we reckon the arc length $s$ along the curve (L) from a certain
point of the curve then we can regard $u$ as a function of $s$: $u = f(s)$.
To form an integral sum we break up the curve (L) into small
elementary arcs. Let the points of division correspond to the values

$$\alpha = s_0 < s_1 < s_2 < \ldots < s_n = \beta$$

where $s = \alpha$ and $s = \beta$ are, respectively, the values corresponding
to the ends of the curve (L), and $n$ is the number of elementary arcs
$s_{k-1} \le s \le s_k$. Now we choose an arbitrary point $\sigma_k$ on each
elementary arc (that is $s_{k-1} \le \sigma_k \le s_k$) and form the integral sum

$$\sum_{k=1}^n f(\sigma_k)\, \Delta s_k, \qquad \Delta s_k = s_k - s_{k-1}$$

To obtain the integral we must pass to the limit in the process
when all the lengths of the elementary arcs are decreased unlimitedly
(compare with Sec. 2), i.e.

$$\int_{(L)} u\, ds = \int_{(L)} f(s)\, ds = \int_\alpha^\beta f(s)\, ds = \lim \sum_{k=1}^n f(\sigma_k)\, \Delta s_k \tag{83}$$

It is integral (83) in which the integration is performed with
respect to the arc length that is called a line integral of the first
type. For instance, in the above-mentioned example from Sec. 4
we have

$$M = \int_{(L)} \rho\, ds$$

Similarly, if a point describes a trajectory (L) and a force $\mathbf{F}$ is
acting on the point in this motion ($\mathbf{F}$ can be variable in the general
case) then the work performed by the force is (compare with the
corresponding example in Sec. 6)

$$A = \int_{(L)} F \cos(\mathbf{F}, \boldsymbol{\tau})\, ds$$

where $\boldsymbol{\tau}$ is the unit vector in the direction of the tangent to (L).
Hence, this work is a line integral of the function $f = F \cos(\mathbf{F}, \boldsymbol{\tau})$
with respect to the arc length.

Formula (83) indicates that a line integral of the first type is
a modification of the ordinary definite integral, and therefore many
properties of the definite integral (in particular, properties 2-5
in Sec. 4 and those enumerated in Sec. 5) are automatically extended
to the line integral. But at the same time it should be taken into
account that $\Delta s$, and therefore $ds$, is always considered to be positive
which implies that when we pass from (83) to an ordinary definite
integral we must perform integration from a smaller value of $s$
to a larger value. Therefore property 1 in Sec. 4 makes no sense
in the case of integrals with respect to the arc length.

A quantity $u$ can sometimes be defined over the whole space.
For instance, it may be represented by a function $u = f(x, y, z)$.
Then integral (83) can be put down in the form

$$I = \int_{(L)} f(x, y, z)\, ds$$

If a curve (L) is represented parametrically in the form $x = x(t)$,
$y = y(t)$ and $z = z(t)$ the evaluation of the corresponding integral
can be performed according to the formula

$$I = \int_\gamma^\delta f(x(t), y(t), z(t)) \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\, dt$$

where the values $t = \gamma$ and $t = \delta$ correspond to the ends of the
curve (L). Here we have taken the expression of $ds$ given in
Sec. VII.23. On the basis of Sec. VII.23, we can also write an
integral of the first type in the form $\int_{(L)} u\, |d\mathbf{r}|$. We also write it as
$\int_{(L)} u(M)\, ds$ where $M$ is the variable point of the curve (L). The
corresponding integral sum can also be written in the form

$$\sum_{k=1}^n u_k\, \Delta s_k = \sum_{k=1}^n u(M_k)\, \Delta s_k$$

where $M_k$ is a point belonging to the $k$th elementary arc.
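
The parametric formula lends itself to a direct numerical evaluation. The fragment below is a minimal sketch, assuming a helix as the test curve; the function name `line_integral_ds` and the way the curve is passed (one function for the point, one for the derivatives with respect to $t$) are our own choices.

```python
import math

def line_integral_ds(f, curve, velocity, t0, t1, n=20_000):
    """Midpoint-rule value of the first-type integral of f along curve(t), t0 <= t <= t1.

    curve(t) returns the point (x, y, z); velocity(t) returns its derivatives in t,
    so that ds = sqrt(x'^2 + y'^2 + z'^2) dt.
    """
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        vx, vy, vz = velocity(t)
        total += f(*curve(t)) * math.sqrt(vx * vx + vy * vy + vz * vz)
    return total * h

# One turn of the helix x = cos t, y = sin t, z = t (an arbitrary test curve).
helix = lambda t: (math.cos(t), math.sin(t), t)
helix_velocity = lambda t: (-math.sin(t), math.cos(t), 1.0)

length = line_integral_ds(lambda x, y, z: 1.0, helix, helix_velocity, 0.0, 2 * math.pi)
mass = line_integral_ds(lambda x, y, z: z, helix, helix_velocity, 0.0, 2 * math.pi)  # density rho = z
print(length, mass)
```

With unit integrand the result is the length of the curve, $2\pi\sqrt{2}$; with the density $\rho = z$ it is the mass $2\sqrt{2}\,\pi^2$ of the material line.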

Let us take an example illustrating an application of the line 
integral of the first type. It is well known in mechanics that if we 
are given a system of material points $M_k(x_k, y_k)$ of masses $m_k$
lying in the $x, y$-plane where $k = 1, 2, \ldots, n$, then the coordinates
of the centre of gravity of the system are defined by the formulas

$$x_c = \frac{m_1 x_1 + m_2 x_2 + \ldots + m_n x_n}{m_1 + m_2 + \ldots + m_n}, \qquad y_c = \frac{m_1 y_1 + m_2 y_2 + \ldots + m_n y_n}{m_1 + m_2 + \ldots + m_n}$$

Now suppose that we have a plane material line (L) with linear
density $\rho$ which may be variable in the general case. The centre of
gravity of the material curve can be found in the following way.

Let us divide (mentally) the curve (L) into small elementary arcs
$\Delta s_k$ and replace each of the arcs by the material point of mass
$m_k = \rho_k \Delta s_k$ lying on the arc. Thus we obtain a “discrete model” of
the material line. The centre of gravity of the model has the
coordinates

$$x_c = \frac{\sum_{k=1}^n x_k \rho_k\, \Delta s_k}{\sum_{k=1}^n \rho_k\, \Delta s_k}, \qquad y_c = \frac{\sum_{k=1}^n y_k \rho_k\, \Delta s_k}{\sum_{k=1}^n \rho_k\, \Delta s_k} \tag{84}$$

Now if we pass to the limit, as the lengths of the elementary arcs
are decreased unlimitedly, our model will turn into the continuous
curve (L) whose centre of gravity, by formulas (84), will have the
coordinates

$$x_c = \frac{\int_{(L)} \rho x\, ds}{\int_{(L)} \rho\, ds}, \qquad y_c = \frac{\int_{(L)} \rho y\, ds}{\int_{(L)} \rho\, ds} \tag{85}$$

If we take a curve with constant linear density the formulas for
the centre of gravity are simplified. Namely, cancelling out
$\rho = \text{const}$ in formulas (85) we obtain the formulas for the so-called
geometric centre of gravity of the curve (L):

$$x_{g.c.} = \frac{\int_{(L)} x\, ds}{L}, \qquad y_{g.c.} = \frac{\int_{(L)} y\, ds}{L} \tag{86}$$

where $L$ designates the length of the curve (L).
Let us compare the second formula (86) with formula (36) of the
area of the surface of a solid of revolution. We see that

$$S = 2\pi \int_{(L)} y\, ds = L \cdot 2\pi y_{g.c.}$$

In other words, if a plane curve rotates about an axis lying in the
plane of the curve and not intersecting it then the area of the surface
of revolution thus obtained is equal to the product of the length
of the curve by the distance passed by its geometric centre of gravity.
This is Guldin's first theorem named after the Swiss mathematician
P. Guldin (1577-1643) who applied the theorem. For instance, the
theorem implies that the surface area of a torus (see Fig. 271), obtained
by the rotation of a circle of radius $r$ about an axis lying in the
same plane as the circle and not intersecting it, is equal to

$$S = 2\pi r \cdot 2\pi R = 4\pi^2 r R$$

where $R$ is the distance from the centre of the circle to the axis
of revolution.

[Fig. 271]
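
Guldin's first theorem for the torus can be verified by computing the geometric centre of gravity of the generating circle numerically. The following is a sketch only: the radii are sample values, the axis of revolution is taken to be the $x$-axis, and the helper `arc_integrals` is a hypothetical name.

```python
import math

def arc_integrals(x, y, dx, dy, t0, t1, n=20_000):
    """Return (length, integral of x ds, integral of y ds) for a plane curve x(t), y(t)."""
    h = (t1 - t0) / n
    length = sx = sy = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        ds = math.hypot(dx(t), dy(t)) * h     # element of arc length
        length += ds
        sx += x(t) * ds
        sy += y(t) * ds
    return length, sx, sy

r, R = 1.0, 3.0   # sample radii with R > r, so that the axis does not meet the circle
# Circle of radius r centred at (0, R); it is rotated about the x-axis.
L, sx, sy = arc_integrals(lambda t: r * math.cos(t), lambda t: R + r * math.sin(t),
                          lambda t: -r * math.sin(t), lambda t: r * math.cos(t),
                          0.0, 2 * math.pi)
y_gc = sy / L                       # ordinate of the geometric centre of gravity
S = L * 2 * math.pi * y_gc          # length times the path of the centre of gravity
print(y_gc, S, 4 * math.pi ** 2 * r * R)
```

The computed centre of gravity lies at the distance $R$ from the axis, and the product of the length by the path of the centre reproduces $4\pi^2 rR$.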

We shall come back to the notion of a line integral of the first 
type in Sec. XVI.1. 

23. Line Integrals of the Second Type. There are line integrals
taken with respect to coordinates, besides the integrals of the first
type. When forming such an integral (called a line integral of the
second type) we suppose that the curve (L) is directed, that is there
is an indication in what direction the curve is described. If we have
a non-closed curve we must indicate which of its ends is regarded
as its initial point and which as its terminal point. To define the
integrals we write, instead of formula (83), the formulas

$$\int_{(L)} u\, dx = \lim \sum_{k=1}^n f(\sigma_k)\, \Delta x_k, \qquad \int_{(L)} u\, dy = \lim \sum_{k=1}^n f(\sigma_k)\, \Delta y_k \qquad \text{and} \qquad \int_{(L)} u\, dz = \lim \sum_{k=1}^n f(\sigma_k)\, \Delta z_k \tag{87}$$

where $\Delta x_k$ is the increment of the abscissa $x$ along the $k$th
elementary arc etc.

Integrals of type (87) are readily reduced to ordinary definite
integrals. For example, if the curve (L) is represented in a parametric
form then the values of $u$ assumed at the points of (L) become
dependent on $t$, i.e. $u$ becomes a function of $t$. Therefore

$$\int_{(L)} u\, dx = \int_\gamma^\delta u\, \dot{x}\, dt$$

where the values $t = \gamma$ and $t = \delta$ correspond to the ends of the
curve (L). Consequently, the basic properties of the definite integral
are extended to line integrals of the second type (properties 2-5
in Sec. 4). But here property 1 in Sec. 4 also holds. It can be
formulated as follows: if the direction of describing the curve (L) is
reversed then integrals of type (87) are multiplied by $-1$. In fact,
if the curve (L) is passed in the opposite direction then all $\Delta x$ (and
all $dx$) change their signs. The possibility of changing the sign
of $dx$ also indicates that the properties which are connected with
integration of inequalities (see Sec. 5) no longer hold for the
integrals of the second type. For instance, an integral of the form (87)
of a positive function may not be positive, in contrast to the
integral of the first type.
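
Both remarks (the change of sign under reversal of the direction and the possible negativity of the integral of a positive function) can be seen on a simple example. The sketch below uses our own choices: the positive function $u = 1 + y^2$, the upper unit semicircle as the curve, and hypothetical helper names.

```python
import math

def integral_u_dx(u, x, y, dx, t0, t1, n=20_000):
    """Second-type integral of u dx along (x(t), y(t)) traversed from t = t0 to t = t1."""
    h = (t1 - t0) / n                      # negative when the curve is traversed backwards
    return h * sum(u(x(t), y(t)) * dx(t)
                   for t in (t0 + (k + 0.5) * h for k in range(n)))

def integral_u_ds(u, x, y, dx, dy, t0, t1, n=20_000):
    """First-type integral of u ds; ds is positive whatever the direction of traversal."""
    h = (t1 - t0) / n
    return abs(h) * sum(u(x(t), y(t)) * math.hypot(dx(t), dy(t))
                        for t in (t0 + (k + 0.5) * h for k in range(n)))

u = lambda x, y: 1.0 + y * y               # a positive function
x, y = math.cos, math.sin                  # upper unit semicircle, 0 <= t <= pi
dx, dy = (lambda t: -math.sin(t)), math.cos

forward = integral_u_dx(u, x, y, dx, 0.0, math.pi)     # from (1, 0) to (-1, 0)
backward = integral_u_dx(u, x, y, dx, math.pi, 0.0)    # the opposite direction
first_type = integral_u_ds(u, x, y, dx, dy, 0.0, math.pi)
print(forward, backward, first_type)
```

Here the integral with respect to $x$ equals $-10/3$ in one direction and $+10/3$ in the other, although the integrand is positive everywhere, whereas the integral with respect to the arc length equals $3\pi/2$ for either direction.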
In the theory of differential equations (Sec. XV.6) and in the
theory of vector field (§ XVI.6) we often use the combinations of
integrals of type (87) of the form

$$\int_{(L)} (u\, dx + v\, dy + w\, dz) = \int_{(L)} u\, dx + \int_{(L)} v\, dy + \int_{(L)} w\, dz$$

where $u$, $v$ and $w$ are given functions of $x$, $y$ and $z$.
Some examples considered in Sec. 8 were virtually integrals of
type (87). Indeed, formulas (29)-(31) can be rewritten as

$$S = -\int_{(L)} y\, dx = \int_{(L)} x\, dy = \frac{1}{2} \int_{(L)} (x\, dy - y\, dx) \tag{88}$$

on the basis of formulas $\dot{x}\, dt = dx$ and $\dot{y}\, dt = dy$.

For our further aims we shall need the integral $\int_{(L)} y\, dx$ taken along
a closed plane curve arbitrarily placed in space, as in Fig. 272.
To evaluate the integral we project the curve (L) on the $xOy$-plane
and verify that

$$\int_{(L)} y\, dx = \int_{(L')} y\, dx \tag{89}$$

Indeed, the points of the curves (L) and (L') which correspond to
each other differ only in the values of the coordinate $z$ which does
not affect integrals (89).

By (88), the integral on the right-hand side of (89) is equal to

$$\int_{(L')} y\, dx = -S'$$

Now we shall use a well-known property of projections: if a plane
figure is projected on another plane then the area of the projection is
equal to the product of the area of the initial figure by the cosine of
the angle between the planes (see Fig. 273). Actually, in this case
the sizes in one direction are multiplied by $\cos \alpha$ whereas the sizes
in the perpendicular direction do not change.

[Fig. 273]

Thus, we have

$$\int_{(L)} y\, dx = -S \cos \alpha = -S \cos(\mathbf{n}, z) \tag{90}$$

where $S$ is the area bounded by the curve (L) and $\mathbf{n}$ is the unit vector
in the direction of the outer normal to the plane of the curve (L).
The direction of the vector $\mathbf{n}$ corresponds to the rule of describing
(L) according to the right-hand screw rule (see Sec. VII.11).
The formula

$$\int_{(L)} x\, dy = S \cos(\mathbf{n}, z) \tag{91}$$

is proved in a similar way.
Performing circular permutation of the coordinate axes (see
Sec. VII.12) we deduce, from formulas (90) and (91), the formulas

$$\int_{(L)} z\, dy = -S \cos(\mathbf{n}, x), \qquad \int_{(L)} x\, dz = -S \cos(\mathbf{n}, y),$$

$$\int_{(L)} y\, dz = S \cos(\mathbf{n}, x) \qquad \text{and} \qquad \int_{(L)} z\, dx = S \cos(\mathbf{n}, y) \tag{92}$$

We note in conclusion that integrals of the forms

$$\oint_{(L)} f(x)\, dx, \qquad \oint_{(L)} \varphi(y)\, dy \qquad \text{and} \qquad \oint_{(L)} \psi(z)\, dz$$

taken along any closed curve (L) are always equal to zero (the notation
$\oint$ is used for designating integrals taken along closed contours).
Actually, let $F(x)$ be an antiderivative of $f(x)$. Then the first of the
above integrals is equal to the increment of the function $F(x)$ gained
as the variable point describes (L) and returns to its original
position. The integral is therefore equal to zero. The second and the
third integrals are treated similarly.
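
The formulas connecting these closed-contour integrals with the projections of the area can be tested on a circle lying in a tilted plane. The sketch below rests on our own assumptions: the orthonormal vectors `e1`, `e2` spanning the plane, the radius, and the helper names are all chosen for the illustration, and the normal is taken as the vector product of `e1` and `e2`, which matches the direction of traversal by the right-hand screw rule.

```python
import math

def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

s2, s3 = math.sqrt(2.0), math.sqrt(3.0)
e1 = (1 / s2, 1 / s2, 0.0)            # orthonormal pair spanning a tilted plane
e2 = (-1 / s3, 1 / s3, 1 / s3)
n_vec = cross(e1, e2)                 # unit normal; its components are the cosines cos(n, x) etc.
r = 2.0                               # radius of the circle (L), centred at the origin
S = math.pi * r * r                   # area bounded by (L)

def closed_integral(i, j, steps=20_000):
    """Integral of (coordinate i) d(coordinate j) round the circle; indices 0, 1, 2 mean x, y, z."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        point = [r * (math.cos(t) * e1[m] + math.sin(t) * e2[m]) for m in range(3)]
        velocity = [r * (-math.sin(t) * e1[m] + math.cos(t) * e2[m]) for m in range(3)]
        total += point[i] * velocity[j]
    return total * h

print(closed_integral(1, 0), -S * n_vec[2])   # integral of y dx against -S cos(n, z)
print(closed_integral(0, 1), S * n_vec[2])    # integral of x dy against  S cos(n, z)
print(closed_integral(1, 2), S * n_vec[0])    # integral of y dz against  S cos(n, x)
```

Each printed pair should coincide, and `closed_integral(0, 0)`, the integral of $x\,dx$ round the contour, should vanish in accordance with the concluding remark.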

24. Conditions for a Line Integral of the Second Type to Be
Independent of the Path of Integration. Consider an integral of the form

$$I = \int_{(L)} [P(x, y, z)\, dx + Q(x, y, z)\, dy + R(x, y, z)\, dz] \tag{93}$$

where $P$, $Q$ and $R$ are some functions defined over the whole space
with the coordinates $x$, $y$ and $z$ or in a domain lying in the space.
Let (L) be an arbitrary curve lying inside the domain. We shall
also suppose that $P$, $Q$ and $R$ are finite in the domain, i.e. they are
bounded and do not approach infinity at any point. There are
physical problems in which we encounter integrals (93) that depend
only on the positions of the initial and terminal points of the
curve (L) but not on the form of (L). This means that these integrals
are independent of the way in which (L) connects the initial point
with the terminal point. In other words, in these cases we have
(Fig. 274)

$$\int_{(L')} (P\, dx + Q\, dy + R\, dz) = \int_{(L'')} (P\, dx + Q\, dy + R\, dz) = \int_{(L''')} (P\, dx + Q\, dy + R\, dz) = \ldots \tag{94}$$

for any fixed points $A$ and $B$ and for all the possible curves $L'$, $L''$,
$L'''$, ... connecting $A$ and $B$ (where $L'$, $L''$, $L'''$, ... lie inside the
domain). We shall say that integral (93) is independent of the
path of integration. Such an integral can express, for instance,
the work of a field of force as a point is moving. Then condition (94)
means that the work only depends on the positions of the initial
and terminal points.

[Fig. 274 and Fig. 275]

For integral (93) to be independent of the path of integration,
it is necessary and sufficient that integral (93) taken round any
closed contour should be equal to zero, that is

$$\oint_{(L)} (P\, dx + Q\, dy + R\, dz) = 0 \tag{95}$$

for any closed contour (L).

To prove the assertion we shall first suppose that condition (95)
is fulfilled and that we are given paths (L') and (L'') with the same
initial and terminal points (see Fig. 274). Let us construct the closed
contour (L) traced from $A$ to $B$ along (L') and from $B$ to $A$ along
(L''), the path (L'') being described in the direction which is opposite
to its original direction in the corresponding integral (94). Now we
take condition (95), break up integral (95) into two integrals taken
along the two paths, according to property 3 in Sec. 4, and change
the direction of integration for the second integral with the
simultaneous change of its sign, according to property 1 in Sec. 4. This
yields

$$0 = \oint_{(L)} (P\, dx + Q\, dy + R\, dz) = \int_{(L')} (P\, dx + Q\, dy + R\, dz) - \int_{(L'')} (P\, dx + Q\, dy + R\, dz)$$

which implies (94). Reversing the preceding argument we can easily
deduce (95) from (94).

For integral (93) to be independent of the path of integration, it
is necessary and sufficient that the element of integration should
be the total differential of some single-valued function of three
variables, that is

$$P\, dx + Q\, dy + R\, dz = du \tag{96}$$

where $u = u(x, y, z)$ is a certain function. To prove the assertion
we first suppose that condition (96) holds. Then

$$\int_{(L)} (P\, dx + Q\, dy + R\, dz) = \int_{(L)} du = u \Big|_A^B = u(B) - u(A)$$

where $A$ and $B$ are, respectively, the initial and the terminal points
of the curve (L). Consequently, the integral is independent of the
path of integration.

Conversely, let integral (93) be independent of the path of
integration. We fix an arbitrary point $M_0$ in space and introduce the
following function of the variable point $M(x, y, z)$:

$$u(M) = \int_{\smile M_0 M} (P\, dx + Q\, dy + R\, dz) \tag{97}$$

where $\smile M_0 M$ is an arbitrary curve connecting $M_0$ and $M$ and directed
from $M_0$ to $M$. Condition (94) suggests that the function is
single-valued, that is it assumes a certain uniquely defined value at each
point $M$. To find $du$ we first give $x$ an infinitesimal increment $dx$.
Then the point $M$ passes to the position $M'$, as in Fig. 275, and the
infinitesimal line segment $MM'$ is parallel to the $x$-axis. The
corresponding increment of the function $u$ is equal to

$$\Delta_x u = u(M') - u(M) = \int_{\smile M_0 M M'} (P\, dx + Q\, dy + R\, dz) - \int_{\smile M_0 M} (P\, dx + Q\, dy + R\, dz) = \int_{MM'} (P\, dx + Q\, dy + R\, dz)$$

But the coordinates $y$ and $z$ not varying along the segment $MM'$,
we have $dy = dz = 0$ in the last integral. Therefore $\Delta_x u = \int_{MM'} P\, dx$.
The segment $MM'$ being infinitesimal, we can consider $P$ to be
constant on it ($P = \text{const}$) to within infinitesimals of higher
order of smallness. Therefore, passing from the increment to the
differential and thus dropping infinitesimals of higher order we
obtain $\partial_x u = P\, dx$. Similarly, we find that $\partial_y u = Q\, dy$ and
$\partial_z u = R\, dz$. Adding up the results we get (see Sec. IX.11)

$$du = P\, dx + Q\, dy + R\, dz$$

and consequently condition (96) is fulfilled.
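If we recall expression (IX.7) of the total differential we can write
condition (96) in the equivalent form

$$\frac{\partial u}{\partial x} = P, \qquad \frac{\partial u}{\partial y} = Q, \qquad \frac{\partial u}{\partial z} = R \tag{98}$$

It follows that if integral (93) is independent of the path of
integration then we have

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}, \qquad \frac{\partial P}{\partial z} = \frac{\partial R}{\partial x}, \qquad \frac{\partial Q}{\partial z} = \frac{\partial R}{\partial y} \tag{99}$$

Indeed, conditions (98) imply

$$\frac{\partial P}{\partial y} = \frac{\partial^2 u}{\partial x\, \partial y}, \qquad \frac{\partial Q}{\partial x} = \frac{\partial^2 u}{\partial y\, \partial x}$$

and therefore, based on the independence of the mixed derivatives
of the order of differentiation (see Sec. IX.15), we obtain the first
equality (99). The other conditions (99) are proved similarly (let
the reader complete the proof!).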

In the theory of vector field (Sec. XVI.27) we shall prove the
converse assertion: if conditions (99) are fulfilled and if the domain
in which the functions $P$, $Q$ and $R$ are considered is simply-connected
then integral (93) is independent of the path of integration. A domain
is said to be simply-connected if any closed contour lying in it can
be contracted to a point by means of a continuous deformation
without falling outside the domain.

The whole space, a half-space, a dihedral or a polyhedral angle,
the interior or the exterior of a sphere, the interior of a finite or
infinite circular cylinder are examples of simply-connected domains.
In contrast to it, the exterior of an infinite circular cylinder is
a doubly-connected domain. Indeed, if we consider the contour (L)
depicted in Fig. 276 we see that it cannot be contracted to a point
without falling outside the domain. Further examples of
multiply-connected domains are the interior or the exterior of a torus and
the whole space from which all the points belonging to a circle or
to an infinite straight line are removed. In Fig. 277 we see
a plane simply-connected domain and a plane four-connected
domain.

Comparing conditions (96) and (99) we see that conditions (99)
are necessary and sufficient for the expression $P\, dx + Q\, dy + R\, dz$
to be a total differential of a single-valued function $u(x, y, z)$
defined in a simply-connected domain. It is possible to show that
if conditions (99) are fulfilled in a multiply-connected domain then
the function $u(x, y, z)$ constructed in accordance with formula (97)
satisfies condition (96) but may not be single-valued in the general
case.

[Fig. 276 and Fig. 277]

An integral of the form

$$\int_{(L)} [P(x, y)\, dx + Q(x, y)\, dy]$$

taken along a plane curve is considered similarly. Accordingly,
in all the expressions we must drop the terms containing $z$ and $dz$.
In particular, conditions (99) turn into one condition

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$$

for such an integral.
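
The role of simple connectedness can be illustrated in the plane. The sketch below is ours and not part of the course: the first pair of functions is the differential of $x^2 y$ in the whole plane, while the second pair satisfies the equality of the cross derivatives everywhere except at the origin, so that its domain is not simply-connected; all names are hypothetical.

```python
import math

def line_integral(P, Q, x, y, dx, dy, t0, t1, n=20_000):
    """Integral of P dx + Q dy along the plane curve (x(t), y(t)) from t = t0 to t = t1."""
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        total += P(x(t), y(t)) * dx(t) + Q(x(t), y(t)) * dy(t)
    return total * h

# Case 1: P dx + Q dy = d(x^2 y); dP/dy = dQ/dx = 2x in the whole plane.
P1 = lambda x, y: 2 * x * y
Q1 = lambda x, y: x * x
straight = line_integral(P1, Q1, lambda t: t, lambda t: t,
                         lambda t: 1.0, lambda t: 1.0, 0.0, 1.0)
parabola = line_integral(P1, Q1, lambda t: t, lambda t: t * t,
                         lambda t: 1.0, lambda t: 2 * t, 0.0, 1.0)

# Case 2: dP/dy = dQ/dx holds only away from the origin, which must be excluded.
P2 = lambda x, y: -y / (x * x + y * y)
Q2 = lambda x, y: x / (x * x + y * y)
around_origin = line_integral(P2, Q2, math.cos, math.sin,
                              lambda t: -math.sin(t), math.cos, 0.0, 2 * math.pi)
print(straight, parabola, around_origin)
```

Both paths from $(0, 0)$ to $(1, 1)$ give the same value 1, the increment of $x^2 y$. In the second case the integral round the unit circle equals $2\pi$ rather than zero: the corresponding function $u$ is the polar angle, which is not single-valued in the plane with the origin removed.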


§ 7. The Concept of Generalized Function


$$y = \delta_N(x) = \begin{cases} 0 & \text{for } -\infty < x < -\frac{1}{2N} \\ N & \text{for } -\frac{1}{2N} < x < \frac{1}{2N} \\ 0 & \text{for } \frac{1}{2N} < x < \infty \end{cases} \tag{100}$$

where $N$ is a very large positive number. The graph of the function
is shown in Fig. 278. As usual, the values of the function at the
points of discontinuity $x = \pm \frac{1}{2N}$ are of no importance and therefore
they are not indicated here. The delta function can be thought of
as the limit of the function (100) as $N \to \infty$. But, strictly speaking,
such a limit should be defined as

$$\delta(x) = \begin{cases} 0 & \text{for } -\infty < x < -0 \text{ and } +0 < x < \infty \\ \infty & \text{for } -0 < x < +0 \end{cases}$$

with the additional condition

$$\int_{-\infty}^{\infty} \delta(x)\, dx = 1$$

which is implied by the fact that $\int_{-\infty}^{\infty} \delta_N(x)\, dx = 1$ because the
area shaded in Fig. 278 is equal to unity. We cannot therefore speak
about the graph of a delta function. The last
relation can also be rewritten in the form

$$\int_{-0}^{+0} \delta(x)\, dx = 1 \tag{101}$$

[Fig. 278]

An approximate representation of the delta
function must not necessarily be obtained by
means of the discontinuous function (100).
For example, we can also take the function
$\delta_N(x) = \frac{N}{\pi} (1 + N^2 x^2)^{-1}$ defined over the interval
$-\infty < x < \infty$ for this purpose (let the
reader investigate the behaviour of the graph
of this function as $N \to \infty$) and so on. Generally,
we can take every function whose values
are “concentrated” near $x = 0$. More precisely, for the delta
function to be defined as the limit of the function $\delta_N(x)$ (understood
in the above sense) it is sufficient that $\delta_N(x)$ should satisfy
the conditions $\delta_N(x) \ge 0$ for $-\infty < x < \infty$,

$$\int_{-\infty}^{-a} \delta_N(x)\, dx \to 0, \qquad \int_b^{\infty} \delta_N(x)\, dx \to 0 \qquad \text{and} \qquad \int_{-a}^{b} \delta_N(x)\, dx \to 1$$

as $N \to \infty$ for any positive constant numbers $a$ and $b$.
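
The concentration conditions can be checked numerically for both approximating families. The fragment below is only a sketch: the names `box` and `lorentz`, the constants $a = 0.5$ and $b = 0.25$ and the midpoint rule are our assumptions.

```python
import math

def box(x, N):
    """Rectangular approximation: height N on an interval of width 1/N centred at x = 0."""
    return N if abs(x) < 1.0 / (2 * N) else 0.0

def lorentz(x, N):
    """The smooth alternative (N / pi) / (1 + N^2 x^2)."""
    return N / (math.pi * (1.0 + (N * x) ** 2))

def midpoint(f, a, b, n=100_000):
    """Composite midpoint rule for a proper integral over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

a, b = 0.5, 0.25          # any positive constants
for N in (10, 100, 1000):
    centre = midpoint(lambda x: lorentz(x, N), -a, b)
    print(N, centre, 1.0 - centre)     # the remainder is the share left in the two tails
```

As $N$ grows the integral over the fixed interval from $-a$ to $b$ tends to unity, so that the share remaining outside this interval tends to zero. For the rectangular function the integral over any such interval equals unity exactly as soon as $\frac{1}{2N}$ is smaller than both $a$ and $b$.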
If we consider masses distributed along the x-axis and their linear
densities then the linear density of a material point of unit mass
placed at the coordinate origin turns out to be equal to the delta
function. In fact, if we first consider this mass to be uniformly
distributed over the segment $-\frac{1}{2N} \le x \le \frac{1}{2N}$ but
not concentrated at the point x = 0 then its density will be of the
form depicted in Fig. 278 (why is it so?). Now if $N \to \infty$ then
the mass will be contracted to the point in the limiting process and
its density will become the delta function.

Similarly, the function $\rho(x) = m\,\delta(x - a)$ is the linear
density of a mass m concentrated at the point x = a (we say “linear
density” because we regard the mass as distributed along a line, that
is along the x-axis in our case). The density of a point charge or of
a point force can be represented in a similar way and so on.

It is sometimes necessary to add together a delta function and
an ordinary function. For example, the sum

$$\rho(x) = \rho_0 + m_1\,\delta(x - a_1) + m_2\,\delta(x - a_2)$$


is the density of the combination of a uniformly distributed mass 
and two discrete mass points. Hence, the use of the delta function 
makes it possible to apply formulas which were originally deduced 
for continuously distributed masses to any combinations of discrete 
mass points and distributed masses. Moreover, from this point of 
view the distinction between discrete mass points and continuously 
distributed masses becomes inessential. The same refers to charges, 
forces and so on. 

When integrating an expression containing delta functions one
must apply formula (101). For instance, if f (x) is a continuous
function in the interval $\alpha \le x \le \beta$ and if
$\alpha < a < \beta$ then

$$\int_{\alpha}^{\beta} f(x)\,\delta(x-a)\,dx = \int_{\alpha}^{a-0} f(x)\,\delta(x-a)\,dx + \int_{a-0}^{a+0} f(x)\,\delta(x-a)\,dx + \int_{a+0}^{\beta} f(x)\,\delta(x-a)\,dx$$

$$= 0 + \int_{a-0}^{a+0} f(a)\,\delta(x-a)\,dx + 0 = f(a)\cdot 1 = f(a) \tag{102}$$

because the first and the third integrals on the right-hand side are
equal to zero since the delta function is equal to zero on the
corresponding intervals whereas f (a) substitutes for f (x) in the
second integral since a continuous function can be regarded as being
constant on an infinitesimal interval; the remaining integral of the
delta function is equal to unity by formula (101).
We must indicate the two-fold sense of an integral of the form
$\int_{a}^{\beta} f(x)\,\delta(x-a)\,dx$. We can speak of the value of
the integral only if there is a specification as to whether the
singularity of the delta function $\delta(x - a)$ (i.e. the point
x = a) is included into the range of integration or not. Thus we must
write either

$$\int_{a-0}^{\beta} f(x)\,\delta(x-a)\,dx = f(a) \qquad\text{or}\qquad \int_{a+0}^{\beta} f(x)\,\delta(x-a)\,dx = 0.$$
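
Formula (102) can be illustrated numerically by replacing the delta
function with its rectangular approximation (100). The sketch below is
only an illustration under our own assumptions (the helper name
`sift_box` and the midpoint rule are not taken from the course, and
the whole rectangle is assumed to lie inside the range of
integration).

```python
import math

def sift_box(f, a, N, steps=1000):
    """Integral of f(x) * delta_N(x - a), with delta_N the box (100).

    Outside |x - a| < 1/(2N) the integrand vanishes, so only that
    short interval is integrated (midpoint rule).
    """
    lo = a - 1.0 / (2 * N)
    h = (1.0 / N) / steps
    return N * h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

# As N grows the integral approaches the value f(a).
for N in (1, 10, 100):
    print(N, sift_box(math.cos, 0.5, N), math.cos(0.5))
```

The integral is simply the mean value of f over the short interval
around a, which is why it tends to f (a) for a continuous function.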
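Integrating the Dirac delta function from $-\infty$ to x we obtain the
step function (the Heaviside unit function)

$$e(x) = \int_{-\infty}^{x} \delta(x)\,dx = \begin{cases} 0 & \text{for } -\infty < x < 0 \\ 1 & \text{for } 0 < x < \infty \end{cases} \tag{103}$$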


which also has many applications. Its graph is shown in Fig. 279. 
Such a function can be applied to describing a process of an instan- 
taneous application of a constant action, for instance, the process 
which occurs when a constant voltage is instantaneously applied 
to the terminals of an electric circuit. 

Thus, the integration of the delta function results in an ordinary 
though discontinuous function. The repeated integration yields 
a continuous function (let the reader investigate the form of the 
function). 

If we differentiate equality (103) we obtain

$$\delta(x) = e'(x) \tag{104}$$

This equality should be understood in the generalized sense. For
instance, we can first substitute the oblique segment connecting the
points $\left(-\frac{1}{2N},\ 0\right)$ and $\left(\frac{1}{2N},\ 1\right)$
for the vertical segment in Fig. 279 (this inclined segment is
represented by the dotted line). Then the discontinuous function is
replaced by a continuous function whose derivative has the graph of
the form shown in Fig. 278. Now passing to the limit, as
$N \to \infty$, we obtain relation (104).

Hence, we can say that a delta function is obtained by
differentiating a discontinuous function having a finite jump. For
example, let us take the law of motion considered in Sec. I.13 which
is connected with Fig. 6. The corresponding velocity is expressed by
the formula

$$s'_t = \begin{cases} gt & \text{for } 0 \le t < t^* \\ v & \text{for } t^* < t < \infty \end{cases}$$

and it has the jump $s'_t(t^* + 0) - s'_t(t^* - 0) = v - gt^*$ at the
point $t = t^*$. The acceleration is therefore equal to
$(v - gt^*)\,\delta(t - t^*) + g\,e(t^* - t)$ (check it up!). The
first term in the sum describes the impact phenomenon.

26. Application to Constructing Influence Function. Constructing 
an influence function (which is also called Green’s function after 
the English mathematician G. Green, 1793-1841) is one of the impor- 
tant applications of the delta function. We begin with an example. 
Let us consider the deflection h (x) of a beam subjected to a
transversal external load of intensity p (x) (see Fig. 280 where we
see the graph or, as it is called, the diagram of the external load).
We shall suppose that the load is not very large and we can therefore
apply the linear law of elasticity which implies that if we combine
external forces the corresponding deflections add up.

Fig. 279    Fig. 280

Let us imagine that we have unit force applied at a point $\xi$ [whose
“intensity” is $\delta(x - \xi)$]. Then the beam will be deformed in
a certain way. We denote the deflection at a point x under the action
of unit force applied at the point $\xi$ by $y = G(x, \xi)$. It is the
function $G(x, \xi)$ that is called the influence function of our
problem. We shall show that if the function is known it is easy to
determine the deflection h (x) under the action of an arbitrary load
of intensity p (x).

Indeed, let us consider the portion of the load on the infinitesimal
segment of the axis from the point $\xi$ to the point $\xi + d\xi$.
This load is equal to $p(\xi)\,d\xi$. Therefore the deflection under
this load at a point x is equal to $G(x, \xi)\,p(\xi)\,d\xi$ because
the linearity we have mentioned above implies that if a load is
multiplied by a constant the corresponding deflection is multiplied by
the same constant. Adding together all these infinitesimal deflections
we obtain the resultant deflection (see Fig. 280):

$$h(x) = \int_{0}^{l} G(x, \xi)\,p(\xi)\,d\xi \tag{105}$$
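
The summation of infinitesimal deflections expressed by formula (105)
can be sketched in Python. The influence function of the beam is not
given in the text, so the sketch below uses a stand-in which is purely
our assumption: the well-known influence function of a taut string of
length l with fixed ends and unit tension. All names are hypothetical.

```python
def influence(x, xi, l=1.0):
    """Stand-in influence function G(x, xi): deflection at x of a taut
    string (unit tension, ends fixed at 0 and l) under a unit force
    applied at xi. This is an assumption, not the beam's own G."""
    return x * (l - xi) / l if x <= xi else xi * (l - x) / l

def deflection(x, load, l=1.0, steps=10_000):
    """Formula (105): sum of G(x, xi) * p(xi) * d(xi) over [0, l]."""
    d_xi = l / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * d_xi          # midpoint of the i-th segment
        total += influence(x, xi, l) * load(xi)
    return total * d_xi

# Uniform load p = 1: for the string the exact answer is x (l - x) / 2.
print(deflection(0.3, lambda xi: 1.0))
```

Doubling the load doubles the computed deflection, which is the
linearity on which the whole construction rests.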
Now we proceed to describe the general scheme for constructing
influence functions. Let an external action be applied to an object
and let it be described by a function f (x) defined over an interval
$a \le x \le b$ (the role of f (x) was played by the function p (x) in
our previous example). Let a function $\tilde f(x)$, $a \le x \le b$,
describe the result of the action (such a result was described by the
function h (x) in the example). Thus, every given function f is
transformed into a new function $\tilde f$. By Sec. XI.6, such a law
of transformation of a preimage which is the function f into the image
which is the function $\tilde f$ is called an operator. For instance,
the operator of differentiation D transforms functions according to
the law Df = f' and thus we have $D(\sin x) = \cos x$,
$D(x^3) = 3x^2$ etc. Here sin x is a preimage which is transformed by
the operator D into the image cos x etc. The concept of an operator
is analogous to the concept of a function (see Sec. I.41) but a
function transforms numbers into numbers (i.e. the values of an
argument into the corresponding values of the function) whereas an
operator transforms functions into functions (or, generally, objects
of any kind into objects of the same kind or of another kind).

Let us denote the operator which transforms a function f (x)
describing an external action into the “response” function
$\tilde f(x)$ by A, that is $Af = \tilde f$. We shall suppose that
there is a linearity law here or, as we say, the superposition
principle: when external actions are added together their results are
also added up. This law which can be written in the form

$$A(f_1 + f_2) = Af_1 + Af_2$$

is often applied when the external actions are not very large. An
operator possessing this property is said to be a linear operator
(compare with Sec. XI.6). (Let the reader verify that the operator D
is linear.) From the law of linearity we can deduce that if an external
action is multiplied by a constant the corresponding result is also
multiplied by the constant, that is

$$A(Cf) = C\,Af$$

where C = const. (Try to justify this rule first by taking positive
integral C and then by putting $C = \frac{1}{n}$, $C = \frac{m}{n}$
where m and n are integers, C = 0 and, finally, by passing to
negative C.)

We now denote as $G(x, \xi)$ the result of an external action
described by the delta function $\delta(x - \xi)$ (regarded as a
function of x for every fixed value of $\xi$), that is

$$A[\delta(x - \xi)] = G(x, \xi)$$
Any function f (x) can be represented as a sum of “impulse functions”
of the form depicted in Fig. 281 below (the corresponding summation
is depicted in Fig. 281 above). Each of these functions has a
singularity only at one point (when we pass to the limit as
$d\xi \to 0$) and is therefore equal to $f(\xi)\,d\xi\,\delta(x - \xi)$
(why is it so?). Thus we have

$$f(x) = \sum f(\xi)\,d\xi\,\delta(x - \xi)$$

This is in fact formula (102) written in another form. It follows that

$$A[f(x)] = A\Big[\sum f(\xi)\,d\xi\,\delta(x - \xi)\Big] = \sum A[f(\xi)\,d\xi\,\delta(x - \xi)] = \sum f(\xi)\,d\xi\,A[\delta(x - \xi)] = \sum f(\xi)\,d\xi\,G(x, \xi)$$

But when $d\xi$ is infinitesimal this sum turns into an integral, and
finally we obtain

$$A[f(x)] = \int_{a}^{b} G(x, \xi)\,f(\xi)\,d\xi \tag{106}$$

[compare with formula (105)].

An influence function can be determined theoretically in simpler
cases (for instance, see the end of Sec. XV.16). In more complicated
cases it can be found experimentally by performing necessary
measurements (for example, we can measure the deformation of an
elastic system caused by the action of unit force). The essential
thing is to verify the linearity of the system in question, that is to
find out whether the superposition principle is applicable. The
applicability can be deduced theoretically or confirmed by an
experiment. Of course, not all systems are linear or even
approximately linear. There are such systems that are essentially
non-linear, and linear methods are inapplicable to such systems.

Fig. 281 (the curve in the figure is labelled $y = f(\xi)\,d\xi\,\delta(x - \xi)$)

It should be noted that the functions $f$ and $\tilde f$ can be
defined over different intervals. Moreover, the independent variables
x and $\xi$ entering into formula (106) may have different physical
meanings. The independent variable $\xi$ is sometimes interpreted as
time. In this case the influence function describes the result of
applying unit impulse at the moment $\xi$.
It sometimes happens that a system in question is linear only
for infinitesimal external actions. But an action whose density is
described by a delta function cannot be considered to be small.
Then the influence function can be defined by the formula

$$G(x, \xi) = \lim_{P \to 0} \frac{1}{P}\,A[P\,\delta(x - \xi)]$$

For instance, in our previous example we can find the deflection
corresponding to a small force P and then divide the result by P.
In such circumstances formula (106) is applicable only to the case
of small (or, more precisely, infinitesimal) external actions f (x).

27. Other Generalized Functions. We now take the approximate
representation $\delta_N(x)$ of the delta function given in Sec. 25
for a continuous model and differentiate it. We obtain an approximate
representation of the derivative $\delta'(x)$. It can be seen that
$\delta'(x)$ takes on the values of both signs and has singularities
that are “more acute” than those of the delta function $\delta(x)$.

We have shown that the $\delta$-function describes the density of
unit charge located at the origin (see Sec. 25). Its derivative
$\delta'(x)$ describes the density of a dipole placed at the same
point. In fact, we can obtain such a dipole if we put the charges
$-q$ and $q$ at the points x = 0 and x = l, respectively, and then
pass to the limit, as l tends to zero, retaining the constant value of
the quantity p = ql. Before the passage to the limit the charge
density has the form

$$q\,\delta(x - l) - q\,\delta(x) = -p\,\frac{\delta(x - l) - \delta(x)}{-l}$$

and therefore, after the passage to the limit for $l \to 0$, the
density becomes equal to $-p\,\delta'(x)$.
Integrals involving $\delta'(x)$ are evaluated with the help of
integration by parts. For instance, if f (x) has a continuous
derivative and if $\alpha < a < \beta$ then

$$\int_{\alpha}^{\beta} f(x)\,\delta'(x - a)\,dx = f(x)\,\delta(x - a)\Big|_{x=\alpha}^{x=\beta} - \int_{\alpha}^{\beta} f'(x)\,\delta(x - a)\,dx = -f'(a)$$

Here we also give an example of incorrect calculations:

$$\int_{\alpha}^{\beta} f(x)\,\delta'(x - a)\,dx = \int_{a-0}^{a+0} f(x)\,\delta'(x - a)\,dx = f(a) \int_{a-0}^{a+0} \delta'(x - a)\,dx = f(a)\,\delta(x - a)\Big|_{a-0}^{a+0} = 0$$

This is wrong because $\delta'(x - a)$ has a very “acute” singularity
and therefore the substitution of f (a) for f (x) results in an
approximation which is not sufficiently accurate. The correct
evaluation of the integral must be performed as above.
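
The difference between the correct and the incorrect evaluation can be
seen numerically. In the sketch below the delta function is replaced
by a Gaussian approximation
$\delta_N(x) = \frac{N}{\sqrt{\pi}}\,e^{-N^2x^2}$; this particular
choice, like the function names, is our own assumption (the text only
requires a function concentrated near x = 0).

```python
import math

def delta_prime(x, N):
    """Derivative of the Gaussian approximation
    delta_N(x) = N / sqrt(pi) * exp(-(N x)^2)."""
    return -2.0 * N**3 * x / math.sqrt(math.pi) * math.exp(-(N * x) ** 2)

def integral_with_delta_prime(f, a, N, steps=4000):
    """Midpoint rule for the integral of f(x) * delta_N'(x - a) over
    |x - a| < 10/N, outside of which the Gaussian is negligible."""
    half = 10.0 / N
    h = 2.0 * half / steps
    total = 0.0
    for i in range(steps):
        x = a - half + (i + 0.5) * h
        total += f(x) * delta_prime(x - a, N)
    return total * h

a = 0.3
# Correct rule: the integral tends to -f'(a), here -cos(0.3).
print(integral_with_delta_prime(math.sin, a, 50), -math.cos(a))
# Incorrect rule: replacing f(x) by the constant f(a) gives zero.
print(integral_with_delta_prime(lambda x: math.sin(a), a, 50))
```

The second computation reproduces the zero obtained by the incorrect
calculation, whereas the first one approaches $-f'(a)$.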

Generalized functions can be classified according to the number 
of successive integrations which must be performed in order to 
obtain a continuous function. If we assume this classification then 
continuous functions can be regarded as generalized functions of 
zero order; functions with finite jump discontinuities and ordinary 
functions with integrable singularities placed at finite distances 
(for example, from the origin) described in Sec. 16 are generalized 
functions of the first order. The function $\delta(x)$ is the simplest
example of a generalized function of the second order (see Sec. 25)
whereas $\delta'(x)$ is of the third order and so on. The
differentiation of a function
increases the order by unity and the integration reduces it by unity. 

When a function with non-integrable singularities is interpreted
as a generalized function we must indicate the function of order 0
or 1 from which the function in question can be obtained by
differentiation because there can be many such functions. For
instance, the function $\frac{1}{x}$ having a non-integrable
singularity at x = 0 can be regarded, for $x \ne 0$, as being equal to

$$(\ln |x|)' \quad\text{or to}\quad (\ln |x| + e(x))' \tag{107}$$

where e (x) is the step function (see Sec. 25) because we have
$e'(x) = 0$ for $x \ne 0$. But functions (107) differ by
$e'(x) = \delta(x)$ and their properties are different. In the theory
of generalized functions it is preferable to use form (107) for
designating the function $\frac{1}{x}$ because after one of these
formulas has been chosen (or some other formula of this kind) the
function $\frac{1}{x}$ is represented in a unique manner as a
generalized function [of course, formulas (107) yield different
representations]. Similarly, the function $\frac{1}{x^2}$ can be
regarded as $-(\ln |x|)''$ and so on. Functions whose rate of growth
at their points of discontinuity is greater than the rate of growth of
any power function (for instance, the function $e^{1/|x|}$) should be
excluded from this classification. By the way, these functions are not
of great importance for applications. It should also be noted that the
growth of a function for $x \to \pm\infty$ is not restricted here.

It turns out that the use of generalized functions makes it possible 
to extend most of the rules connected with differentiation of diffe- 
rent formulas without usual restrictions imposed on the behaviour 
of the functions. For instance, the Leibniz formula for differenti- 
ation of an integral with respect to a parameter (see Sec. 20) becomes 
true for any type of convergence of improper integrals in question 
and the like. 


CHAPTER XV 


Differential Equations
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A differential equation is an equation connecting two or more func- 
tionally dependent variables and their differentials or, which is the 
same, their derivatives. The problem of forming and solving these 
equations is widely encountered in physics and engineering. The 
process of solving a differential equation is called integration of the 
differential equation. 


§ 1. General Notions


1. Examples. We have already dealt with some simple differential
equations in our course. For instance, take equations (XIV.22).
If the force F (s) varies according to Newton’s law (i.e.
$F = \frac{k}{s^2}$, see Sec. XIV.14) we can rewrite the equation in
the form

$$dA = \frac{k}{s^2}\,ds \quad\text{or}\quad \frac{dA}{ds} = \frac{k}{s^2} \tag{1}$$

where the work A = A (s) is an unknown function of the displacement s.
Equation (1) is a differential equation, and the unknown function
A (s) is found by means of integration.

Another example is equation (XIV.27) which can be rewritten as 


dh o 
a SRS V 2gh (2) 


where h = h (t) is an unknown function. 

In addition, let us consider an example of elastic vibrations of
a material point of mass M about an equilibrium position (see
Fig. 282). Here the unknown function y = y (t) expresses the law
of vibrations. For simplicity’s sake, let us suppose that there is
a linear law of elasticity here, that is the elastic force is directly
proportional to the deviation of the point from the equilibrium
position. Then the force is equal to

$$F = -ky$$
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where k is the stiffness factor. If there are no other forces then, 
according to Newton’s second law, we have 


$$M\,\frac{d^2 y}{dt^2} = -ky \tag{3}$$

Thus, the differential equation of the law of vibrations is of the form

$$M\,\frac{d^2 y}{dt^2} + ky = 0 \tag{4}$$


where y = y (t) is the unknown function. 

A differential equation of a problem in physics or engineering
is always deduced on the basis of a certain law describing a
relationship between infinitesimal variations of quantities in
question (a differential law). After the differential equation has
been integrated we get an integral law describing finite variations of
the quantities. The deduction of basic differential equations in a
certain branch of science is a very important operation because it
essentially determines the course of further development of this
branch.

Fig. 282

2. Basic Definitions. A differential equation is usually taken 
in a form which connects an argument (or several arguments) and 
an unknown function (or several unknown functions) with its deri- 
vatives. Even if we originally have a relationship between diffe- 
rentials it is possible to transform it into a relationship between the 
derivatives [see formulas (1)]. If the unknown function in a diffe- 
rential equation depends on one variable the differential equation 
is called ordinary (for example, the equations in Sec. 1). Otherwise
the equation is called a partial differential equation (think why it
is called so). In this chapter we shall deal only with ordinary
differential equations.

The highest order of the derivative of the unknown function 
entering into an equation is called the order of the differential 
equation. Thus, equations (1) and (2) are first-order equations whe- 
reas equation (4) is a second-order equation. The general form of 
a differential equation of the nth order is 


$$F(x, y, y', y'', \dots, y^{(n)}) = 0 \tag{5}$$


where y = y (x) is the sought-for function. In particular cases 
a function F may not depend on some of the quantities entering 
into (5). For example, equation (4) does not contain the independent 
variable and the derivative of the first order. 

A function is called a solution of a differential equation if it redu- 
ces the equation to an identity when substituted into the equation. 


(Labels in Fig. 282: equilibrium position; elastic force.)
Even the simplest examples indicate that a differential equation
has infinitely many solutions. For instance, taking a simple equation
of the form

$$y' = x^2, \qquad y = y(x) \tag{6}$$

we immediately find, by integrating, that

$$y = \frac{x^3}{3} + C \tag{7}$$

This is the general solution of equation (6). It contains an
arbitrary constant C and is the set of solutions containing all
solutions of the equation. Making the arbitrary constant assume
concrete numerical values we obtain particular solutions of
equation (6), for example

$$y = \frac{x^3}{3} + 6 \quad\text{etc.}$$

If we take an nth-order equation of the form

$$y^{(n)} = x^2, \qquad y = y(x)$$

then its general solution can be found by means of n subsequent
integrations and therefore it contains n arbitrary constants.
Similarly, the general solution of an equation of the form (5) also
contains n arbitrary constants, i.e. it has the form

$$y = \varphi(x, C_1, C_2, \dots, C_n) \tag{8}$$

We often obtain the general solution in an implicit form

$$\Phi(x, y, C_1, C_2, \dots, C_n) = 0 \tag{9}$$

Relations (8) and (9) are also called general integrals of equation (5).
A particular solution can be obtained from (8) or (9) if we make each
of the arbitrary constants $C_1, C_2, \dots, C_n$ take on a certain
concrete numerical value. The graph of every particular solution is
called an integral curve of the differential equation in question.
Substituting these concrete numbers for the constants
$C_1, C_2, \dots, C_n$ into equation (8) or (9) we get an equation of
the integral curve.

To isolate a unique particular solution from the general solution 
we must set some additional conditions. Such conditions are often 
taken as so-called initial conditions. If we have a process developing 
in time then such conditions are a mathematical expression of the 
initial state of the process. 

For example, take the process of vibrations considered in Sec. 1. 
The physical meaning of the problem makes it clear that a particular 
(concrete) vibration will be completely specified if we set the values 


32* 
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of the initial deviation of the point from the equilibrium position 
and of the initial velocity. Therefore the initial conditions for 
equation (3) are of the form 


$$y = y_0 \quad\text{and}\quad \frac{dy}{dt} = v_0 \quad\text{for } t = t_0 \tag{10}$$

where $y_0$ and $v_0$ are the given values. In the general case of an
equation of form (5) the initial conditions are

$$y = y_0,\quad y' = (y')_0,\quad \dots,\quad y^{(n-1)} = (y^{(n-1)})_0 \quad\text{for } x = x_0 \tag{11}$$

where the values $y_0, (y')_0, \dots, (y^{(n-1)})_0$ are given. Since
the general solution [for instance, of form (9)] contains n arbitrary
constants, it is possible (at least theoretically) to determine the
values of the constants taking advantage of the n relations we have
set. Thus,
generally speaking, the number of additional conditions (11) is 
sufficient for determining the arbitrary constants and specifying 
the particular solution. It appears natural, from the physical point 
of view, that if a differential law controlling the development of 
a process and an initial state of the process are given then the pro- 
cess itself is completely specified. 
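
As an illustration of how the initial conditions (10) single out one
definite vibration, equation (4) can be integrated step by step on a
computer. The sketch below is only an illustration: the classical
fourth-order Runge-Kutta scheme, the function name and the numerical
values are our own choices. For comparison it uses the well-known
closed-form solution
$y = y_0\cos\omega t + \frac{v_0}{\omega}\sin\omega t$ with
$\omega = \sqrt{k/M}$, which is quoted here without derivation.

```python
import math

def rk4_oscillator(M, k, y0, v0, t_end, steps=1000):
    """Integrate M*y'' + k*y = 0 from t = 0 to t_end with the initial
    conditions y(0) = y0, y'(0) = v0 (classical Runge-Kutta scheme)."""
    def rhs(y, v):
        # First-order system: y' = v, v' = -(k/M) * y
        return v, -k * y / M

    h = t_end / steps
    y, v = y0, v0
    for _ in range(steps):
        k1y, k1v = rhs(y, v)
        k2y, k2v = rhs(y + 0.5 * h * k1y, v + 0.5 * h * k1v)
        k3y, k3v = rhs(y + 0.5 * h * k2y, v + 0.5 * h * k2v)
        k4y, k4v = rhs(y + h * k3y, v + h * k3v)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return y, v

M, k, y0, v0, t = 2.0, 8.0, 1.0, 0.0, 1.0
omega = math.sqrt(k / M)
exact = y0 * math.cos(omega * t) + v0 / omega * math.sin(omega * t)
print(rk4_oscillator(M, k, y0, v0, t)[0], exact)
```

Changing either of the two initial values changes the computed
vibration, in agreement with the fact that a second-order equation
requires two initial conditions.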

Condition (11) for an equation of the first order of form (6) means
that for a certain value $x = x_0$ we must assign a value $y = y_0$.
For instance, let it be necessary to isolate a solution for which
y (1) = 2. Then (7) implies $2 = \frac{1}{3} + C$, i.e.
$C = \frac{5}{3}$. Hence, the sought-for particular solution has
the form

$$y = \frac{x^3 + 5}{3}$$

The problem of finding a particular solution of a differential 
equation when certain initial conditions are given is called the 
Cauchy problem (initial-value problem). 

As we shall see in Sec. 7, there are some differential equations
which possess so-called singular solutions in addition to the
particular ones contained in the general solution, that is solutions
that do not enter into the general solution.


§ 2. First-Order Differential Equations 


3. Geometric Meaning. The general form of a first-order
differential equation can be written as

$$F(x, y, y') = 0 \tag{12}$$

where y = y (x) is the unknown function. For simplicity’s sake,
we first suppose that the equation is solved for the derivative of the
unknown function. Then the equation takes the form

$$y' = f(x, y) \tag{13}$$

According to Sec. 2, the initial condition for equation (12) must
be written as

$$y = y_0 \ \text{ is given for } x = x_0 \tag{14}$$
To get a geometric interpretation of equation (13) let us introduce
a plane with Cartesian coordinates x and y. Then every particular
solution is represented by a curve (the integral curve) lying in the
x, y-plane, these curves being yet unknown. But if we take an
arbitrary point M (x, y) in the plane we can compute the value
of f (x, y) which, as it is prescribed by equation (13), must equal
the slope of the tangent (see Sec. IV.3) to the desired curve at the
point M (x, y) provided the curve passes through the point M.

Fig. 283 (segment at the point M (x, y) with slope $\tan\alpha = f(x, y)$)    Fig. 284 (integral curve through the point $M_0(x_0, y_0)$)

We can therefore perform the following procedure: let us draw
(mentally) a small line segment with the slope
$\tan\alpha = f(x, y)$ at each point M (x, y) (see Fig. 283).
Practically we can draw only a number of such segments but
theoretically we can regard the segments as drawn through all the
points. Thus we obtain the so-called direction field in the plane
defined by equation (13) (the general concept of a field was
introduced in Sec. IX.9). Hence, we see that the integral curves of
equation (13) must pass through the points M (x, y) of the x, y-plane
in such a way that each curve should touch the segment at each point
it passes through.

Thus, equation (13) defines a direction field in the x, y-plane.
On the other hand, initial condition (14) defines a point
$M_0(x_0, y_0)$ through which the desired integral curve should pass.
From the geometric point of view it is clear that the above condition
completely specifies the integral curve (see Fig. 284). In other
words, initial condition (14) being given, equation (13) has a
completely specified unique solution. A more detailed investigation
carried out by Cauchy shows that the above assertion holds provided
the function f is continuous at the point $M_0$ and has a finite value
of its derivative $\frac{\partial f}{\partial y}$ at the point
(Cauchy’s conditions are in fact sufficient for the existence and
uniqueness of the solution but they are not necessary; certain cases
when the above conditions of Cauchy's theorem are not fulfilled will
be considered in Sec. 7; as it will be seen, this may yield
non-fulfillment of the uniqueness of a solution).

These considerations concerning a direction field can be illustrated
by means of the well-known experiment with iron filings placed in
a magnetic field. The arrangement of the filings demonstrates a
direction field whose integral curves are the so-called magnetic lines
of force.

Fig. 285. PR is the direction of the field

The geometric meaning of equation (13) we have just discussed enables
us to construct (approximately) the integral curves of the equation.
To do this we depict the directions of the field at as many points as
possible and then draw the curves according to the directions.

Practically, in constructing a direction field it is more convenient
to take the points belonging to the so-called isoclines instead of
choosing the points arbitrarily. An isocline is a locus of points
at which the field has the same direction. We can derive an equation
of an isocline if we equate the right-hand side of equation (13) to
a constant. Thus we write

$$f(x, y) = k$$

where k is the slope of the field corresponding to the isocline we
have taken.

For instance, let us take the equation $y' = x + y$.
Equating the right-hand side to the constants
$-2,\ -\frac{3}{2},\ -1,\ -\frac{1}{2},\ 0,\ 1$ and $2$ we obtain the
corresponding isoclines which, in this case, are straight lines (i.e.
the lines $x + y = -2$ etc.). These isoclines are depicted in
Fig. 285. The direction of the field is indicated on each of the
isoclines. To find the direction of the field on a given isocline
corresponding to the slope k of the field we can construct the
right-angled triangle PQR having the base PQ = 1 parallel to the
x-axis and the altitude QR = k. Then the side PR will indicate the
desired direction. There are also several integral curves in Fig. 285
which are drawn in accordance with the directions. We see that the
straight line $x + y = -1$ is one of these curves. We also see that
the locus of the lowest points of the integral curves is the straight
line $x + y = 0$. In the case of an equation of general form (13) in
order to find the loci of the highest and of the lowest points
belonging to the integral curves it is necessary to construct the
isocline f (x, y) = 0. (Think how we can find the locus of the
points of inflection of integral curves in the general case.)
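
Drawing an integral curve “according to the directions” of the field
can be imitated on a computer by moving in small steps along the
direction prescribed at each point. The sketch below does this for
the equation $y' = x + y$; the step-by-step rule used (known as
Euler's method) and all the names are our own choices for the sake of
illustration.

```python
def slope(x, y):
    """Right-hand side of y' = x + y: the slope of the field at (x, y)."""
    return x + y

def follow_field(x0, y0, x_end, steps=10_000):
    """Start at (x0, y0) and repeatedly move a short distance along
    the direction of the field; return the ordinate reached at x_end."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        y += h * slope(x, y)
        x += h
    return y

# A curve started on the line x + y = -1 stays on that line:
y_end = follow_field(0.0, -1.0, 2.0)
print(2.0 + y_end)

# A curve started at the origin leaves the isocline x + y = 0 at once:
print(follow_field(0.0, 0.0, 1.0))
```

The first result stays equal to $-1$ up to rounding, in agreement with
the remark that the straight line $x + y = -1$ is itself an integral
curve.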

4. Integrable Types of Equations. We say that a differential equ- 
ation is integrable by quadratures if its general solution is expres- 
sible in an explicit or implicit form which may contain quadratures 
(i.e. indefinite integrals) of some known functions. We consider the 
integration completed even if these quadratures are not in fact 
computed (the theory of the integral given in Chapters XIII and 
XIV deals with the methods of integrating functions). As we know, 
a quadrature may not be expressible in terms of elementary func- 
tions but nevertheless in this case we shall as well consider the 
integration of a differential equation completed. Unfortunately, 
even most of the simplest equations are not integrable by quadra- 
tures and it is therefore necessary to investigate them by means 
of some other methods which will be discussed later. But there exist 
certain classes of differential equations that are integrable by quad- 
ratures and we shall study them here. 

1. Differential equations with variables separable have been already
dealt with (see Sec. XIV.7). They have the general form

$$\frac{dy}{dx} = f(x)\,\varphi(y) \tag{15}$$

and their general solution is expressed by the formula

$$\int \frac{dy}{\varphi(y)} = \int f(x)\,dx + C \tag{16}$$

We have written the arbitrary constant C (which we usually regard
as being included into the sign of indefinite integral) in order to
stress that the constant enters into the general solution. Thus, we
see that equation (15) has been integrated by quadratures [this
is expressed by formula (16)].

For example, take a simple equation of the form

$$\frac{dy}{dx} = 2xy$$
Separating the variables and performing integration we obtain

$$\frac{dy}{y} = 2x\,dx, \quad\text{i.e.}\quad \ln |y| = x^2 + C, \quad y = \pm e^{C} e^{x^2} \tag{17}$$

The answer can be written in a different form if we notice that
the expression $\pm e^{C}$ can also be regarded as an arbitrary
constant. Therefore $y = Ce^{x^2}$. It is apparent that the symbol C
in the last formula designates a new constant which is different from
the one entering into formula (17).

To avoid the change of notation we could write

$$\ln |y| = x^2 + \ln C$$

while integrating equation (17), since ln C is also an arbitrary
constant. Then, taking exponentials, we get

$$|y| = Ce^{x^2}, \quad y = \pm Ce^{x^2} \quad\text{or, simply,}\quad y = Ce^{x^2}$$

because the signs + and - may also be included into C, that is C
in the expression $y = Ce^{x^2}$ is allowed to take on values of
arbitrary signs. Further we are going to perform transformations of
this kind without specific stipulations.
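
The result $y = Ce^{x^2}$ can be verified by substituting it into the
equation $\frac{dy}{dx} = 2xy$. The short sketch below does this
numerically, estimating the derivative by a central difference; the
function names and the sample values of x and C are our own
illustrative choices.

```python
import math

def y_general(x, C):
    """General solution y = C * exp(x^2), C of arbitrary sign."""
    return C * math.exp(x * x)

def residual(x, C, h=1e-6):
    """dy/dx - 2*x*y, with dy/dx estimated by a central difference."""
    dydx = (y_general(x + h, C) - y_general(x - h, C)) / (2 * h)
    return dydx - 2 * x * y_general(x, C)

# The residual is zero (up to the accuracy of the difference quotient)
# for positive, negative and zero values of the constant C.
for C in (2.0, -3.0, 0.0):
    print(C, residual(0.7, C))
```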


We can similarly construct the general solution of the equation

$$P(x)\,Q(y)\,dx + R(x)\,S(y)\,dy = 0$$

which is also an equation with variables separable.

2. Equations homogeneous in the argument and in the unknown
function. An equation of form (12) is said to be homogeneous in x
and y if its left-hand side is a homogeneous function in x and y,
that is if

$$F(tx, ty, y') = t^k F(x, y, y')$$

(the general definition of a homogeneous function was given earlier
in the course). Then equation (12) can be rewritten in the form

$$F\left(x \cdot 1,\ x \cdot \frac{y}{x},\ y'\right) = 0, \quad x^k F\left(1, \frac{y}{x}, y'\right) = 0, \quad\text{i.e.}\quad F\left(1, \frac{y}{x}, y'\right) = 0$$

Solving the last equation for y' we obtain

$$y' = \varphi\left(\frac{y}{x}\right) \tag{18}$$

This equation is easily integrated by means of the substitution

$$\frac{y}{x} = u, \quad y = ux, \quad y' = u'x + u$$

where u = u (x) is a new unknown function which replaces y.
Substituting y = ux into (18) we derive

$$u'x + u = \varphi(u), \quad\text{i.e.}\quad \frac{du}{\varphi(u) - u} = \frac{dx}{x}$$
and thus the variables have been separated. To complete the
integration we must solve the last equation and then return from u
to the original unknown function y.
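
As a worked illustration (the particular equation is our own choice
and does not come from the text), take $y' = 1 + \frac{y}{x}$, that is
$\varphi(u) = 1 + u$ in the notation of (18). With y = ux the equation
becomes $u'x = \varphi(u) - u = 1$, whence $u = \ln |x| + C$ and
$y = x\,(\ln |x| + C)$. The sketch below checks this result
numerically; the function names are hypothetical.

```python
import math

def y_general(x, C):
    """Candidate solution y = x * (ln|x| + C) of y' = 1 + y/x."""
    return x * (math.log(abs(x)) + C)

def residual(x, C, h=1e-6):
    """y'(x) - (1 + y/x), with y' estimated by a central difference."""
    dydx = (y_general(x + h, C) - y_general(x - h, C)) / (2 * h)
    return dydx - (1.0 + y_general(x, C) / x)

# The residual vanishes (up to the accuracy of the difference
# quotient) on both sides of the point x = 0.
print(residual(2.0, 0.5), residual(-1.5, -2.0))
```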
Let us take a more general equation than (18), namely,

$$\frac{dy}{dx} = f\left(\frac{ax + by + c}{mx + ny + p}\right)$$

Suppose that the binomials ax + by and mx + ny are not proportional
to each other. Then we can make a substitution of the form
$x = x_1 + \alpha$ and $y = y_1 + \beta$ where the parameters
$\alpha$ and $\beta$ should be chosen so that there should be no
absolute terms in the numerator and the denominator of the resulting
fraction. After that we substitute u for the ratio $\frac{y_1}{x_1}$
and thus obtain an equation with variables separable.
3. Linear equations. An equation of form (12) is called linear if
its left-hand side is a linear function in the unknown function and
its derivative, i.e. an equation of the form

$$a(x)\,y' + b(x)\,y + c(x) = 0$$

Dividing the equation by a(x) we obtain

$$y' + p(x)\,y = f(x) \qquad (19)$$

where $p = \frac{b}{a}$ and $f = -\frac{c}{a}$.


Equation (19) is called a homogeneous linear equation if f(x)
is identically equal to zero. Otherwise the equation is called
non-homogeneous. To solve the general non-homogeneous equation of
form (19) we first investigate the auxiliary homogeneous equation

$$z' + p(x)\,z = 0 \qquad (20)$$

which corresponds to (19) and is obtained from (19) by dropping
the non-homogeneity term f(x). The variables in equation (20)
are easily separated:

$$\frac{dz}{dx} = -p(x)\,z, \quad \frac{dz}{z} = -p(x)\,dx, \quad \ln |z| = -\int p(x)\,dx + \ln C,$$

$$z = C \exp\left[-\int p(x)\,dx\right] = C z_1 \qquad (21)$$

where $z_1$ is obtained from z if we substitute C = 1. Thus, $z_1$ is a
particular solution of equation (20).

After $z_1$ has been found we seek a solution of equation (19) in
the form

$$y = \varphi(x)\,z_1 \qquad (22)$$

where $z_1$ is the same as in formula (21) and φ(x) is a function yet
unknown. Such a replacement of the former arbitrary constant
entering into formula (21) by a function entering into formula (22)
is an application of a general method called the method of variation
of arbitrary constants (parameters) which was introduced by Lagrange.

Substituting (22) into equation (19) we obtain

$$\varphi' z_1 + \varphi z_1' + p \varphi z_1 = f, \quad \varphi' z_1 + \varphi\,(z_1' + p z_1) = f$$

Since the function $z_1$ satisfies equation (20), the expression inside the
brackets equals zero. Therefore,

$$\varphi'(x) = \frac{f(x)}{z_1(x)}, \quad \varphi(x) = \int \frac{f(x)}{z_1(x)}\,dx + C,$$

$$y = z_1(x) \int \frac{f(x)}{z_1(x)}\,dx + C z_1(x)$$

The last expression is the general solution of equation (19). The
first term of the expression can be obtained by substituting C = 0.
Hence, the first term is a particular solution of the equation.

Thus, the general solution of non-homogeneous linear equation (19)
is equal to the sum of a particular solution of the non-homogeneous
equation and the general solution of the corresponding homogeneous
equation.

The equation

$$y' + p(x)\,y = f(x)\,y^n$$

is called Bernoulli's equation. It can be reduced to a linear equation
by dividing both sides by $y^n$ and introducing the change of variables
$y^{1-n} = u$ (check it up!).
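The variation-of-constants formula for the linear equation (19) can be checked numerically. The sketch below is only an illustration under assumed data: it takes the hypothetical equation y' + y = x (so p = 1, f = x, neither taken from the text), for which $z_1 = e^{-x}$ and $\int f/z_1\,dx = (x - 1)e^{x}$, and verifies with a central difference that the resulting function satisfies the equation for several values of C.

```python
import math

def general_solution(x: float, c: float) -> float:
    """y = z1 * integral(f / z1) + C * z1 for y' + y = x, where z1 = exp(-x)."""
    z1 = math.exp(-x)
    particular = z1 * (x - 1.0) * math.exp(x)  # z1 times the integral of x*exp(x)
    return particular + c * z1

def residual(x: float, c: float, h: float = 1e-5) -> float:
    """y' + p*y - f with p = 1, f = x; y' is estimated by a central difference."""
    dy = (general_solution(x + h, c) - general_solution(x - h, c)) / (2.0 * h)
    return dy + general_solution(x, c) - x

for c in (-2.0, 0.0, 3.0):
    worst = max(abs(residual(x, c)) for x in (-1.0, 0.0, 0.5, 2.0))
    print(c, worst)  # should be close to zero for every C
```

Setting c = 0 leaves only the first term, which is the particular solution mentioned above.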

There are some other types of equation integrable by quadratures
(see [24]) but most differential equations are not integrable
by quadratures. For example, in the general case we cannot integrate
by quadratures the equation $y' = y^2 + f(x)$ which is called
Riccati's equation (named after J. Riccati, 1676-1754, an Italian
mathematician). Riccati's equations occur in some applied problems.
There are many other simple differential equations that are not
integrable by quadratures.

5. Equation for Exponential Function. Let us consider the equation

$$\frac{dy}{dx} = ky \quad (k = \text{const}) \qquad (23)$$

which indicates that the rate of change of the quantity y with respect
to the quantity x is proportional to the current value of y. Hence,
if y(x) > 0 then y increases for k > 0 and decreases for k < 0.
Such a simple relationship between a quantity and its rate of change
is often used as a first approximation in investigating various
processes.

The variables in equation (23) can be separated which yields

$$\frac{dy}{y} = k\,dx, \quad \ln |y| = kx + \ln C, \quad y = Ce^{kx}$$

In addition, if there is an initial condition $y(x_0) = y_0$ we obtain

$$y_0 = Ce^{kx_0}, \quad C = y_0 e^{-kx_0}, \quad \text{i.e.} \quad y = y_0 e^{k(x - x_0)} \qquad (24)$$


Thus, the general solution of equation (23) is an exponential
function (see Sec. I.27). It is characteristic of this solution that if
we make x assume the values forming an arithmetic progression
with common difference Δx then the corresponding values of y form
a geometric progression with common ratio $e^{k\Delta x}$. We can easily
find the value of Δx for which y doubles or halves at
every step Δx. Indeed, for this we must have

$$|k\,\Delta x| = \ln 2, \quad \text{i.e.} \quad \Delta x = \frac{\ln 2}{|k|} \qquad (25)$$

If k > 0 and $y_0 > 0$ formula (24) describes the so-called law of
exponential growth of the quantity y which is characteristic of
various chain reactions. As an example, let us consider the process
of reproduction of bacteria in a culture medium when their number
is not too large. We suppose that all the bacteria reproduce more
or less independently. Then we see that the rate of growth of the
number u of the bacteria measured in certain units is proportional
to the number, i.e.

$$\frac{du}{dt} = ku, \quad u = u_0 e^{k(t - t_0)}$$

There are many problems of this kind that can be investigated
in a similar way, the problem of calculating the growth of capital
deposited in a bank being one of them.

If k < 0 then formula (24) expresses an exponential decrease
of a quantity y. For example, this is the case when we investigate
the process of radioactive decay. In fact, let us denote the mass
of the remaining (not disintegrated) radioactive substance by m.
If we suppose that different parts of the mass decay independently
then we conclude that the rate of decay of the mass is proportional
to the current value of the mass, that is

$$\frac{dm}{dt} = -pm, \quad m = m_0 e^{-p(t - t_0)}$$

In particular, note that after the elapse of time $\Delta t = \frac{\ln 2}{p}$ the
value of m becomes half as large, as follows from formula
(25). The time interval Δt is known as the half-life period (or simply
half-life) of the radioactive substance. For instance, Δt is
approximately equal to $1.8 \times 10^3$ years for radium. This means that if
an initial mass of radium is not replenished then in $1.8 \times 10^3$ years
we shall have half of the initial quantity and after another $1.8 \times 10^3$
year period passes we shall have a quarter of the initial mass etc.

There are many phenomena, such as the decrease of the atmospheric
pressure with the growth of the altitude or the discharge of a
capacitor through a resistance, which are investigated in a similar
way.
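The halving step of formula (25) for a decaying quantity can be illustrated with a short computation. This is a minimal sketch: the decay constant p below is a hypothetical value chosen only for the demonstration, and the initial moment is taken as $t_0 = 0$.

```python
import math

def half_life(p: float) -> float:
    """Step after which m = m0*exp(-p*t) is halved: ln 2 / p, as in formula (25)."""
    return math.log(2.0) / p

def remaining_fraction(p: float, t: float) -> float:
    """Fraction m/m0 left after time t, from m = m0*exp(-p*(t - t0)) with t0 = 0."""
    return math.exp(-p * t)

p = 0.25              # hypothetical decay constant, in 1/year
dt = half_life(p)
for n in range(4):
    # equal time steps give a geometric progression with ratio 1/2
    print(n, remaining_fraction(p, n * dt))
```

The printed fractions form a geometric progression, which is the property of the exponential function noted after formula (24).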

The equation of a problem can sometimes be transformed to form
(23) by means of some simple techniques. For example, as was
shown in Sec. VIII.7, the electric current i in a circuit consisting
of a resistance R and an inductance L satisfies the equation

$$L\frac{di}{dt} + Ri = u \qquad (26)$$

when a constant voltage u is applied to the terminals of the circuit.

Equation (26) is a non-homogeneous linear equation which can
be integrated (solved) by means of the method described in Sec. 4.
But it is easier to transform the equation in the following way:

$$L\frac{di}{dt} = -R\left(i - \frac{u}{R}\right), \quad \frac{d}{dt}\left(i - \frac{u}{R}\right) = -\frac{R}{L}\left(i - \frac{u}{R}\right)$$

whence, by formula (24),

$$i - \frac{u}{R} = \left(i_0 - \frac{u}{R}\right) e^{-\frac{R}{L}(t - t_0)}$$

We obtain a simpler case when there is no initial current in the
circuit. Indeed, let $t_0 = 0$ be the initial moment. Then
$i(t_0) = i(0) = i_0 = 0$ and we obtain the formula

$$i - \frac{u}{R} = -\frac{u}{R}\,e^{-\frac{R}{L}t}, \quad \text{i.e.} \quad i = \frac{u}{R}\left(1 - e^{-\frac{R}{L}t}\right) \qquad (27)$$
The graph of relationship (27) is shown in Fig. 286. We see that
the current increases and exponentially tends to the limiting
steady-state value $\frac{u}{R}$ as $t \to \infty$. This value can also be easily found from
equation (26) if we take into account that $\frac{di}{dt} \to 0$ as $t \to \infty$ in the
process of current rise. Therefore, in the limit we have Ri = u,
i.e. $i = \frac{u}{R}$. Thus, when the current becomes practically steady
the whole voltage drop is across the resistance. During the time
period

$$\Delta t = \frac{\ln 2}{R/L} = \frac{L}{R}\ln 2$$

the deviation of the current from its limit value is halved.

The fact that it is the constant e that appears as the base of the
exponential function in formula (24) is one of the main reasons
which account for the important role of the constant in mathematics
and its applications.
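Formula (27) for the current rise in the circuit with resistance R and inductance L, together with the halving time (L/R) ln 2, can be tried out numerically. The following is a rough sketch only; the component values are hypothetical and serve merely as an example.

```python
import math

def current(t: float, u: float, r: float, l: float) -> float:
    """Formula (27): i(t) = (u/R) * (1 - exp(-R*t/L)), zero initial current."""
    return (u / r) * (1.0 - math.exp(-r * t / l))

u, r, l = 12.0, 4.0, 0.5        # hypothetical values: volts, ohms, henries
steady = u / r                  # limiting steady-state value u/R
dt = (l / r) * math.log(2.0)    # time over which the deviation is halved

for n in range(4):
    t = n * dt
    i = current(t, u, r, l)
    print(f"t={t:.4f}  i={i:.4f}  deviation={steady - i:.4f}")
```

Each row shows a deviation from u/R equal to half of the previous one, in agreement with formula (25) applied with |k| = R/L.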

6. Integrating Exact Differential Equations. A differential equation
of the first order is often written in the symmetric form

$$P(x, y)\,dx + Q(x, y)\,dy = 0 \qquad (28)$$

in place of form (13). Here P(x, y) and Q(x, y) are given functions,
and it is the functional relationship between x and y that is
considered to be unknown. We can easily pass from one form to another.
For instance, to transform equation (28) to form (13) we must divide
both sides of (28) by Q dx and then transpose $\frac{P}{Q}$ to the right-hand
side. Form (28) is preferable in those cases when the variables x
and y are regarded as being equivalent, that is when we do not
set beforehand which of the variables x and y is the argument and
which is the function.

Fig. 286    Fig. 287

There is a special case here when the left-hand side of equation
(28) is the total (exact) differential of a function, that is

$$P\,dx + Q\,dy = du(x, y) \qquad (29)$$

Then we can easily integrate the equation. Actually, in this case
the equation can be rewritten as du = 0 and hence, integrating,
we get the general solution

$$u(x, y) = C \qquad (30)$$

where C is an arbitrary constant, as usual.

At the end of Sec. XIV.24 we obtained a condition guaranteeing
the existence of such a function u(x, y), namely, the condition

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \qquad (31)$$

This condition is necessary and sufficient for the left-hand side
of equation (28) to be an exact differential. The function u(x, y)
is found according to formula (XIV.97) in which, of course, we must
drop the last summand under the integral sign in our case. As was
pointed out in Sec. XIV.24, generally we can arrive at a multiple-valued
function u if the domain under consideration is multiply connected.
But even in this case formula (30) expresses the general
solution of equation (28).

As an example, let us take the equation

$$(x^2 + 2xy)\,dx + (x^2 - y^3)\,dy = 0 \qquad (32)$$

Here we have

$$\frac{\partial P}{\partial y} = \frac{\partial (x^2 + 2xy)}{\partial y} = 2x \quad \text{and} \quad \frac{\partial Q}{\partial x} = \frac{\partial (x^2 - y^3)}{\partial x} = 2x$$

Thus, condition (31) is fulfilled. In order to apply formula (XIV.97)
to constructing the function u we choose the point $M_0$ at the origin
of coordinates, for definiteness. We also choose the path connecting
$M_0$ with the variable point M(x, y) in the way shown in Fig. 287.
Consequently, we obtain

$$\begin{aligned} u(x, y) &= \int_{M_0 M} \left[(x^2 + 2xy)\,dx + (x^2 - y^3)\,dy\right] \\ &= \int_{M_0 M'} \left[(x^2 + 2xy)\,dx + (x^2 - y^3)\,dy\right] + \int_{M' M} \left[(x^2 + 2xy)\,dx + (x^2 - y^3)\,dy\right] \end{aligned} \qquad (33)$$

We must put y = 0 and dy = 0 in the first integral and x = const
and dx = 0 in the second integral (why?).
From this we obtain

$$u(x, y) = \int_0^x x^2\,dx + \int_0^y (x^2 - y^3)\,dy = \frac{x^3}{3} + x^2 y - \frac{y^4}{4} \qquad (34)$$

Hence, the general solution of equation (32) has the form

$$\frac{x^3}{3} + x^2 y - \frac{y^4}{4} = C$$
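The result for equation (32) can be checked by differentiation: the function u of formula (34) should have partial derivatives equal to P and Q, and P, Q should satisfy condition (31). The sketch below does this numerically with central differences; the helper name `partial` and the sample points are arbitrary choices made for this illustration.

```python
def P(x: float, y: float) -> float:
    return x**2 + 2.0 * x * y                   # coefficient of dx in equation (32)

def Q(x: float, y: float) -> float:
    return x**2 - y**3                          # coefficient of dy in equation (32)

def u(x: float, y: float) -> float:
    return x**3 / 3.0 + x**2 * y - y**4 / 4.0   # formula (34)

def partial(f, x: float, y: float, var: str, h: float = 1e-5) -> float:
    """Central-difference estimate of df/dx (var='x') or df/dy (var='y')."""
    if var == "x":
        return (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

for x, y in [(0.5, -1.0), (1.0, 2.0), (-2.0, 0.3)]:
    cond31 = abs(partial(P, x, y, "y") - partial(Q, x, y, "x"))   # condition (31)
    du = abs(partial(u, x, y, "x") - P(x, y)) + abs(partial(u, x, y, "y") - Q(x, y))
    print((x, y), cond31, du)   # both numbers should be close to zero
```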
Let us similarly treat the equation

$$-\frac{y\,dx}{x^2 + y^2} + \frac{x\,dy}{x^2 + y^2} = 0 \qquad (35)$$

By the way, the equation can be easily integrated in a direct way
because the variables in the equation are separable. The equation
makes sense for all values of x and y except x = 0, y = 0 since
both coefficients P and Q are discontinuous at the point (0, 0).
We must therefore take the x, y-plane with the origin of coordinates
punctured. A plane with one point punctured is not simply connected.
As was shown in Sec. XIV.24, it is doubly connected.

Condition (31) is also fulfilled for equation (35) (check it up!).
In constructing the function u in accordance with formula (XIV.97),
we can choose the point $M_0$ anywhere except, of course, the origin
of coordinates. For instance, let us take the point $M_0(1, 0)$. We
first consider the case x > 0. Performing calculations analogous to
(33) and (34) we find $u = \arctan\frac{y}{x}$ (check up the result!). The
same function $u = \arctan\frac{y}{x}$ satisfies relation (29) for x < 0 too.
But if we consider the function $u = \arctan\frac{y}{x}$ to be defined over
the whole x, y-plane (with the point (0, 0) removed) it will be
discontinuous on the straight line x = 0. To get rid of the
discontinuity we can put

$$u = \operatorname{Arctan}\frac{y}{x} = \varphi$$

where φ is the polar angle of the point (x, y). This function is not
single-valued. Even if we take a certain point M other than the origin
and choose a certain value of φ for the point, the argument φ will gain
the increment 2π after M traces a closed path round the origin. But
nevertheless the general solution of equation (35) is of the form

$$\operatorname{Arctan}\frac{y}{x} = C, \quad \text{i.e.} \quad \frac{y}{x} = \tan C = C_1 \quad \text{and} \quad y = C_1 x$$

where $C_1$ is an arbitrary constant. From the geometric point of view
we have obtained the totality of all the possible straight lines passing
through the origin of coordinates.
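The two "check it up" remarks concerning equation (35) can be carried out numerically. In the sketch below the polar angle is represented by `math.atan2`, which returns one particular branch with values in (−π, π]; this choice of branch and the sample points (kept away from the negative x-axis, where that branch jumps) are assumptions of the illustration.

```python
import math

def P(x: float, y: float) -> float:
    return -y / (x**2 + y**2)      # coefficient of dx in equation (35)

def Q(x: float, y: float) -> float:
    return x / (x**2 + y**2)       # coefficient of dy in equation (35)

def polar_angle(x: float, y: float) -> float:
    return math.atan2(y, x)        # one branch of the polar angle

def d_dx(f, x: float, y: float, h: float = 1e-6) -> float:
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

def d_dy(f, x: float, y: float, h: float = 1e-6) -> float:
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

# Points in both half-planes x > 0 and x < 0.
for x, y in [(1.0, 0.5), (2.0, -1.0), (-1.5, 0.7), (-0.5, -2.0)]:
    cond31 = abs(d_dy(P, x, y) - d_dx(Q, x, y))
    grad = (abs(d_dx(polar_angle, x, y) - P(x, y))
            + abs(d_dy(polar_angle, x, y) - Q(x, y)))
    print((x, y), cond31, grad)    # both should be close to zero
```

The same branch works on either side of the line x = 0, which reflects the fact that the polar angle, unlike arctan(y/x), has no discontinuity there.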

It may happen that condition (31) does not hold for equation (28).
Then the left-hand side of such an equation is not a total differential.
It can be shown that in this case there always exists a factor
such that the equation becomes exact after being multiplied by
the factor. Then the equation can be integrated provided the factor
is known. For example, the left-hand side of the equation
−y dx + x dy = 0 does not satisfy condition (31). But if we multiply the
equation by the factor $\frac{1}{x^2 + y^2}$ the equation is transformed to
form (35), which satisfies condition (31). Generally, such a factor
is called the integrating factor of equation (28). There are no general
techniques for finding an integrating factor but it can be found
for certain specific classes of differential equations. Besides, the
concept of an integrating factor is used in some theoretical
investigations.

7. Singular Points and Singular Solutions. There are some cases
when a first-order equation written in the form

$$y' = f(x, y) \qquad (36)$$

[see equation (13)] or in the form

$$P(x, y)\,dx + Q(x, y)\,dy = 0 \qquad (37)$$

[see equation (28)] possesses more than one integral curve passing
through some point in the x, y-plane or has no such curves. The
points of this kind are called singular points of the equation in
question. They can either be isolated or form entire singular curves.

We begin our investigation of equation (36) with a simple special
example, namely, with the equation

$$y' = y^{\alpha} \quad (\alpha > 0) \qquad (38)$$

Let us regard y(x) as being non-negative (y ≥ 0). Equation (38)
can be easily integrated:

$$\frac{dy}{y^{\alpha}} = dx, \quad \frac{y^{1-\alpha}}{1-\alpha} = x - C \ \text{ for } \alpha \neq 1 \quad \text{and} \quad \ln y = x - C \ \text{ for } \alpha = 1 \qquad (39)$$

Here we have written −C instead of C for the convenience of our
further considerations. The sign in front of C does not matter because
C itself can have any sign. Let us distinguish between the following
two cases.

1. α > 1. Then solution (39) can be put down as

$$y = \left[(\alpha - 1)(C - x)\right]^{-\frac{1}{\alpha - 1}} = \frac{\text{const}}{(C - x)^{\frac{1}{\alpha - 1}}}$$

which implies that if x varies from −∞ to C then y increases from
zero to infinity. The graph is shifted along the x-axis as the constant
C is changed. The family of integral curves thus obtained is depicted
in Fig. 288. The x-axis itself is an integral curve. It can be obtained
as the limit when C → ∞. We see that in this case, for each point
of the upper half-plane, there is one and only one integral curve
passing through the point. Our example also shows that the solution
y(x) of an equation may not be defined over the whole x-axis and
may exist only for a certain part of the axis. Indeed, in this example
y(x) exists only on the interval −∞ < x < C.

In the case α = 1 we also get a similar uniqueness (check up this
assertion!).

2. 0 < α < 1. Then we can write solution (39) in the form

$$y = (1 - \alpha)^{\frac{1}{1-\alpha}} (x - C)^{\frac{1}{1-\alpha}} = \text{const} \cdot (x - C)^{\frac{1}{1-\alpha}} \qquad (40)$$

from which it follows that if x varies from C to ∞ the variable y
increases from zero to infinity. The corresponding family of integral
curves is shown in Fig. 289. The x-axis is an integral curve again,
which is evident from equation (38), but now it cannot be obtained
from formula (40) for any value of C. In this case, for each point
of the x-axis, there are two distinct integral curves passing through
the point, namely, the x-axis itself and the corresponding curve
defined by formula (40). Recall that the problem of constructing
an integral curve passing through a given point of the plane is
called the Cauchy problem. Thus, in our case the uniqueness of
solution of the Cauchy problem is violated.

Fig. 288    Fig. 289
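The loss of uniqueness in the case 0 < α < 1 can be seen on a concrete instance. The sketch below assumes α = 1/2 (any value in the interval would do) and checks that both the curve given by formula (40) and the function y = 0 pass through the point (C, 0) and satisfy equation (38); derivatives are estimated by central differences.

```python
ALPHA = 0.5   # illustrative choice of the exponent, 0 < ALPHA < 1

def curve(x: float, c: float) -> float:
    """Formula (40): y = ((1 - a)*(x - C))**(1/(1 - a)) for x >= C."""
    return ((1.0 - ALPHA) * (x - c)) ** (1.0 / (1.0 - ALPHA))

def x_axis(x: float) -> float:
    """The solution y = 0 whose graph is the x-axis."""
    return 0.0

def residual(y_func, x: float, h: float = 1e-6) -> float:
    """y' - y**alpha, with y' estimated by a central difference."""
    dy = (y_func(x + h) - y_func(x - h)) / (2.0 * h)
    return dy - y_func(x) ** ALPHA

c = 1.0

def through_point(x: float) -> float:
    return curve(x, c)

print(through_point(c), x_axis(c))   # both curves pass through the point (C, 0)
for x in (1.5, 2.0, 4.0):
    print(x, residual(through_point, x), residual(x_axis, x))   # all close to zero
```

For α = 1/2 formula (40) reduces to the parabola y = (x − C)²/4, taken for x ≥ C.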

We see that in the second case the points belonging to the x-axis
become singular points. To find out the cause of this phenomenon
we compute the derivative of the right-hand side of equation
(38) with respect to y at these points, that is for the values y = 0.
Calculating we obtain

$$\left(\frac{\partial (y^{\alpha})}{\partial y}\right)_{y=0} = \left(\alpha y^{\alpha - 1}\right)_{y=0} = 0 \ \text{ for } \alpha > 1, \quad \left(\frac{\partial (y^{\alpha})}{\partial y}\right)_{y=0} = 1 \ \text{ for } \alpha = 1$$

and

$$\left(\frac{\partial (y^{\alpha})}{\partial y}\right)_{y=0} = \infty \ \text{ for } \alpha < 1$$

Therefore, as a point (x, y) approaches the x-axis in the case
0 < α < 1, the direction of the field changes so fast that every
integral curve approaches the axis at a finite point but not at infinity,
as we had in Fig. 288.

We see that in the case under consideration the conditions of
Cauchy's theorem on existence and uniqueness of the solution are
not fulfilled because they include the requirement that the derivative
$\frac{\partial f}{\partial y}$ should be finite. In other cases we can also obtain more than
one integral curve passing through a point $M_0$ if the value of the
derivative $\left(\frac{\partial f}{\partial y}\right)_{M_0}$ is infinite at the point $M_0$ although this is not
a necessary consequence of the fact.

In particular, if $\frac{\partial f}{\partial y}$ approaches infinity on a curve (L) and if the
curve itself is an integral curve then, as a rule, besides (L), there
is at least one more integral curve passing through each point of (L).
In this case we say that (L) is a singular integral curve which means
that (L) is an integral curve all of whose points are singular points.
The corresponding solution for which a singular integral curve
serves as its graph is called a singular solution. A singular solution
does not usually enter into the general solution, that is, as a rule,
it cannot be obtained from the general solution for any value of the
arbitrary constant. For instance, the x-axis is a singular integral
curve and the function y = 0 is the corresponding singular solution
of equation (38) in the case 0 < α < 1 (why?).

There is another approach to the notion of a singular solution.
Fig. 289 shows that the x-axis is the envelope of the family of
integral curves (see Sec. XII.5) in the case when 0 < α < 1 for
equation (38).

In the general case the envelope of a family of integral curves
is also an integral curve because its tangent coincides with the
direction of the field at every point provided such an envelope exists.
At the same time such an envelope is a singular integral curve since
there are other integral curves passing through the points of the
envelope. This leads to a method of finding singular solutions based
on the consideration given in Sec. XII.5. Suppose we have managed
to obtain the general solution in the form Φ(x, y, C) = 0. Then,
by Sec. XII.5, we can find a singular solution if we eliminate C
from the equations

$$\Phi(x, y, C) = 0 \quad \text{and} \quad \Phi'_C(x, y, C) = 0 \qquad (41)$$

[let the reader perform the calculations for solution (40)].

We now proceed to investigate equation (37). For the sake of
simplicity, let us suppose that the functions P and Q are continuous
and that their derivatives of the first order are finite. Equation (37)
can be rewritten in the form

$$\frac{dy}{dx} = -\frac{P(x, y)}{Q(x, y)} \quad \text{or} \quad \frac{dx}{dy} = -\frac{Q(x, y)}{P(x, y)} \qquad (42)$$

We can therefore apply the above-mentioned Cauchy's theorem
concerning equation (36). Thus, if $Q(x_0, y_0) \neq 0$ or $P(x_0, y_0) \neq 0$ at
a point $M_0(x_0, y_0)$ then there exists a unique integral curve passing
through the point $M_0(x_0, y_0)$. [To apply Cauchy's theorem it is
sufficient to denote as f(x, y) that one of the right-hand sides of (42)
which has a non-zero denominator.] But if

$$P(x_0, y_0) = 0, \quad Q(x_0, y_0) = 0 \qquad (43)$$

then equation (37) no longer defines a certain relationship between
dx and dy at the point $M_0(x_0, y_0)$ and therefore the direction field
turns out to be undetermined at the point. Hence, the singular
points of equation (37) are defined by relationship (43).


Fig. 290. Singular points of differential equations
(a) Nodal point: $y\,dx - x\,dy = 0$, $y = Cx$
(b) Nodal point: $2y\,dx - x\,dy = 0$, $y = Cx^2$
(c) Saddle point: $y\,dx + x\,dy = 0$, $xy = C$
(d) Centre: $x\,dx + y\,dy = 0$, $x^2 + y^2 = C$
(e) Focal point: $(x + y)\,dx - (x - y)\,dy = 0$, $\rho = Ce^{\varphi}$
(in polar coordinates)

In Fig. 290 we give some examples of the most widely encountered
types of singular point together with their names. (Let the reader
verify the solutions written in Fig. 290 and the forms of their graphs
which are also shown there.) The origin of coordinates is the only
singular point in all these examples. In examples (a), (b) and (e)
there are infinitely many integral curves passing through the singular
point; there are two such curves in example (c) whereas there are
no integral curves passing through the point in example (d). It
should be noted that the coordinate axes themselves are integral
curves in examples (a), (b) and (c). To integrate equation (e)
in Fig. 290 it is convenient to transform the equation to polar
coordinates.

8. Equations Not Solved for the Derivative. The equation

$$F(x, y, y') = 0 \qquad (44)$$

differs from equation (13) investigated in Sec. 3 because in this
case y' is an implicit function of x and y. A characteristic feature
of implicit functions is that generally they are many-valued
(see Sec. I.20). Therefore, if we solve equation (44) for y' (which
is theoretically possible but may be difficult to realize practically)
we shall get several solutions in the general case:

$$y' = f_1(x, y), \quad y' = f_2(x, y), \quad \ldots, \quad y' = f_k(x, y) \qquad (45)$$

Each of the solutions satisfies equation (44).

Each of equations (45) defines its own direction field in the plane
and generates a family of integral curves which cover the plane
(see Sec. 3). Therefore if in a certain part of the x, y-plane equation
(44) possesses k solutions with respect to y' we have the superposition
of these k direction fields. Hence, there are k integral curves
passing through each point of the part of the plane, that is an initial
condition $y(x_0) = y_0$ defines k solutions (see Fig. 291 where we
have taken k = 3).

In Sec. 7 we considered singular points, singular curves and
singular integral curves of equation (36), and equation (44) may
likewise have them. Equation (44) implies that

$$\frac{\partial y'}{\partial y} = -\frac{F'_{y}(x, y, y')}{F'_{y'}(x, y, y')}$$

(see Sec. IX.13) and therefore, by Cauchy's theorem (see Sec. 3),
such points and curves may appear only if equation (44) and the
equation

$$F'_{y'}(x, y, y') = 0 \qquad (46)$$

hold simultaneously (of course, if $F'_{y} \neq \infty$).
In particular, a singular solution (provided it exists) whose graph
is the envelope of a family of integral curves can be obtained by
eliminating y' from (44) and (46). There is another method of
constructing a singular solution based on formulas (41).

Fig. 291

9. Method of Integration by Means of Differentiation. There are
certain cases when equation (44) can be integrated if it is
differentiated beforehand. For instance, let us take the equation

$$x = f(y') \qquad (47)$$

Such equations are usually written in the form

$$x = f(p) \qquad (48)$$

where p = y'.
Differentiating both sides we derive

$$dx = f'(p)\,dp$$

By means of the last equality and the formula $\frac{dy}{dx} = p$ we find the
following expression of dy:

$$dy = p\,dx = p f'(p)\,dp$$

This implies

$$y = \int p f'(p)\,dp + C \qquad (49)$$

Equalities (48) and (49) simultaneously define a functional relation
between x and y which is expressed parametrically (see
Sec. II.6), the variable p being the parameter. Thus we have obtained
the general solution of equation (47) in a parametric form. An
equation of the form y = f(y') can be solved in a similar way
(verify it!).

The so-called Lagrange's equation of the form

$$y = f(y')\,x + g(y'), \quad \text{i.e.} \quad y = f(p)\,x + g(p) \quad (p = y') \qquad (50)$$

is a little more complicated than (47). The equation is linear in the
variables x and y but it is non-linear in the sense of the definition
given in Sec. 4. After the equation has been differentiated we get

$$dy = p\,dx = f'(p)\,dp \cdot x + f(p)\,dx + g'(p)\,dp$$

that is

$$[p - f(p)]\,\frac{dx}{dp} = f'(p)\,x + g'(p)$$

If f(p) ≢ p we can divide both sides of the equation by p − f(p)
and thus obtain a linear equation [see equation (19)] in which x
is regarded as a function of p. After the last equation has been
integrated we obtain a relationship of the form x = x(p, C). This
relation together with relation (50) defines the general solution
of the original equation in parametric form.

In a special case when f(p) ≡ p equation (50) is called Clairaut's
equation after the French mathematician A. Clairaut (1713-1765)
who was the first to investigate the equation in 1734. It has the form
y = xy' + g(y'), i.e.

$$y = xp + g(p) \quad (p = y') \qquad (51)$$

The differentiation yields p dx = p dx + x dp + g'(p) dp, and
thus we have

$$dp\,[x + g'(p)] = 0 \qquad (52)$$

Equating the first factor to zero we obtain, by (51),

$$p = C, \quad \text{i.e.} \quad y = Cx + g(C) \qquad (53)$$

This is the general solution of equation (51).


Equating to zero the second factor entering into the left-hand
side of (52) we deduce

$$x = -g'(p), \quad y = xp + g(p) = -p\,g'(p) + g(p) \qquad (54)$$

Thus we have obtained one more solution, a singular solution, which
is not contained in the general solution and is represented
parametrically. Geometrically, formula (53) defines a family of
straight lines (why?). Formula (54) defines the envelope of the
family. [Verify the last assertion on the basis of equation (53).]

For instance, the equation $y = xy' - y'^2$ has the general solution

$$y = Cx - C^2 \qquad (55)$$

and the singular solution whose graph is the envelope of the
family of straight lines (55). To find the envelope we differentiate
both sides of (55) with respect to C which yields

$$0 = x - 2C$$

Eliminating C from the last two formulas we get $C = \frac{x}{2}$. This
results in

$$y = \frac{x}{2}\,x - \left(\frac{x}{2}\right)^2 = \frac{x^2}{4}$$
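The example just considered can be verified directly. The sketch below checks that every straight line (55) and the parabola y = x²/4 satisfy the equation y = xy' − y'², and that each line touches the parabola at the point x = 2C. The function names are introduced here only for the illustration.

```python
def line(x: float, c: float) -> float:
    """General solution (55): y = C*x - C**2, a straight line with slope C."""
    return c * x - c**2

def envelope(x: float) -> float:
    """Singular solution found above: y = x**2 / 4, with slope x / 2."""
    return x**2 / 4.0

def clairaut_residual(x: float, y: float, p: float) -> float:
    """y - (x*p - p**2); zero when (x, y, y' = p) satisfies y = x*y' - y'**2."""
    return y - (x * p - p**2)

for c in (-1.0, 0.5, 2.0):
    x_touch = 2.0 * c                               # from 0 = x - 2C
    gap = line(x_touch, c) - envelope(x_touch)      # common point of line and parabola
    on_line = clairaut_residual(3.0, line(3.0, c), c)
    on_envelope = clairaut_residual(x_touch, envelope(x_touch), x_touch / 2.0)
    print(c, gap, on_line, on_envelope)             # all three values should vanish
```

At x = 2C the slope x/2 of the parabola equals the slope C of the line, so the two curves are tangent there.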


The corresponding integral curves are depicted in Fig. 292.

§ 3. Higher-Order Equations and Systems of Differential
Equations

10. Higher-Order Differential Equations. Some general notions
related to such equations were given in Sec. 2 [namely, equation (5),
general solution (8) or (9) and initial conditions (11)]. By the way,
usually it is easier to investigate an nth-order equation when it
is given in the form solved for the highest derivative:

$$y^{(n)} = f\left(x, y, y', \ldots, y^{(n-1)}\right)$$

In particular, Cauchy's theorem is immediately extended to such
an equation (see Sec. 3): if the function f is continuous and has
finite partial derivatives of the first order with respect to
$y, y', \ldots, y^{(n-1)}$ at a given initial point defined by conditions (11) then
there exists a solution of the equation satisfying the initial
conditions and the solution is unique.

We now consider some particular types of higher-order equations 
that are integrable by quadratures. For higher-order equations such 
cases are still rarer than for the first-order equations. Here we shall 
consider non-linear equations (linear equations will be treated in 
detail in § 4). The main method for formal integration of non-linear 
higher-order differential equations is the method of reducing the 
order. It enables us to pass to an equation of a lower order which is 
equivalent to the original equation. As a rule, the lower the order 
of an equation, the easier the integration. Besides, we can sometimes 
come to a first-order equation belonging to one of the integrable 
types (see Sec. 4) after the order of the original equation is reduced 
several times. In such a case we are able to complete the integration. 
Here we shall consider certain particular methods of reducing the 
order. Some other techniques can be found in [24]. 

1. For example, let us take the equation of the second order

$$y'^2 + yy'' = 0$$

To integrate it observe that its left-hand side can be rewritten
in the form

$$y'^2 + yy'' = (yy')'$$

whence (yy')' = 0, $yy' = C_1$, $y\,dy = C_1\,dx$ and

$$\frac{y^2}{2} = C_1 x + C_2$$

(which is the general solution).
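The general solution y²/2 = C₁x + C₂ can be tested by substitution into the equation y'² + yy'' = 0. A minimal numerical sketch follows; it takes the positive branch of the square root and a few arbitrary values of the constants, chosen so that C₁x + C₂ stays positive near the test point.

```python
import math

def y(x: float, c1: float, c2: float) -> float:
    """Positive branch of the general solution y**2 / 2 = C1*x + C2."""
    return math.sqrt(2.0 * (c1 * x + c2))

def residual(x: float, c1: float, c2: float, h: float = 1e-4) -> float:
    """y'**2 + y*y'' with both derivatives estimated by central differences."""
    y0, yp, ym = y(x, c1, c2), y(x + h, c1, c2), y(x - h, c1, c2)
    d1 = (yp - ym) / (2.0 * h)
    d2 = (yp - 2.0 * y0 + ym) / (h * h)
    return d1 * d1 + y0 * d2

for c1, c2 in [(1.0, 2.0), (-0.5, 3.0), (2.0, 0.5)]:
    print((c1, c2), residual(1.0, c1, c2))   # close to zero for every C1, C2
```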
Further, an equation of the form

$$y'^2 - yy'' = 0$$

can be easily integrated in a similar way if we divide both sides
by y² beforehand. Indeed,

$$\frac{yy'' - y'^2}{y^2} = 0, \quad \left(\frac{y'}{y}\right)' = 0, \quad \frac{y'}{y} = C_1 \quad \text{and} \quad \frac{dy}{y} = C_1\,dx$$

which results in the general solution

$$\ln |y| = C_1 x + \ln C_2, \quad y = C_2 e^{C_1 x}$$

Here, after the division by y², we have obtained a so-called
integrable combination, that is an expression whose "exact derivative"
is equated to zero. Such a method is sometimes applicable to other
equations.

For the sake of simplicity, we shall further consider the cases
when the order can be reduced for an equation of the second order
of the form

$$F(x, y, y', y'') = 0 \qquad (56)$$


2. Let an equation of form (56) not contain y. Let it contain
only the derivatives of y and the argument. This is an equation
of the type

$$F(x, y', y'') = 0 \qquad (57)$$

Introducing the notation y' = p = p(x) we obtain the equation

$$F(x, p, p') = 0$$

which is implied by (57). Thus, we have got a first-order equation.
Suppose we have managed to integrate this equation and have
obtained its general solution $p = \varphi(x, C_1)$. Then we have
$y' = \varphi(x, C_1)$ and therefore the general solution of equation (57)
is thus obtained:

$$y = \int \varphi(x, C_1)\,dx + C_2$$
3. Let equation (56) not contain x, i.e. let it be of the form

$$F(y, y', y'') = 0 \qquad (58)$$

Then we again put y' = p but regard p as being a function of y.
It should be noted that here we cannot simply substitute the
expression y'' = p' for the derivative y'' into (58) because p' is the
derivative of p with respect to x but not to the new argument y.
Therefore we write

$$y'' = \frac{d(y')}{dx} = \frac{dp}{dx} = \frac{dp}{dy}\,\frac{dy}{dx} = p\,\frac{dp}{dy}$$

Now we deduce from equation (58) the equation

$$F\left(y, p, p\,\frac{dp}{dy}\right) = 0$$

which is a first-order equation. If we manage to integrate it and to
find its general solution $p = \varphi(y, C_1)$ then we have $\frac{dy}{dx} = \varphi(y, C_1)$
and therefore the general solution of equation (58) can be directly
written in the form

$$\int \frac{dy}{\varphi(y, C_1)} = x + C_2$$

4. Let the left-hand side of equation (56) be a homogeneous
function in the variables y, y' and y'' (see Sec. IX.12):

$$F(x, ty, ty', ty'') = t^k F(x, y, y', y'') \qquad (59)$$

In this case we can reduce the order by means of the substitution

$$\frac{y'}{y} = u$$

Indeed, the substitution results in

$$y' = uy, \quad y'' = u'y + uy' = u'y + u\,uy = (u' + u^2)\,y,$$

$$F\left(x, y, yu, y(u' + u^2)\right) = 0$$

and thus F(x, 1, u, u' + u²) = 0. While writing the last expression
we have used property (59). If we manage to integrate the
first-order equation thus obtained we have

$$u = \varphi(x, C_1), \quad \frac{y'}{y} = \varphi(x, C_1)$$

From this we deduce the general solution of the original equation:

$$\ln |y| = \int \varphi(x, C_1)\,dx + \ln C_2, \quad \text{i.e.} \quad y = C_2\,e^{\int \varphi(x, C_1)\,dx}$$


11. Connection Between Higher-Order Equations and Systems
of First-Order Equations. A higher-order equation of form (5) can
always be reduced to a system of n equations of the first order
containing n unknown functions. To do this it is sufficient to introduce
the notation

$$y = y_1, \quad y' = y_2, \quad y'' = y_3, \quad \ldots, \quad y^{(n-1)} = y_n \qquad (60)$$

Then, by equation (5), we can write

$$\begin{aligned} y_1' &= y_2 \\ y_2' &= y_3 \\ &\;\;\vdots \\ y_{n-1}' &= y_n \\ F(x, y_1, y_2, \ldots, y_n, y_n') &= 0 \end{aligned} \qquad (61)$$

System (61) is of a particular form. The general form of a first-order
system (for the case of three equations containing three unknown
functions) is written as

$$\begin{aligned} F_1(x, y_1, y_2, y_3, y_1', y_2', y_3') &= 0 \\ F_2(x, y_1, y_2, y_3, y_1', y_2', y_3') &= 0 \\ F_3(x, y_1, y_2, y_3, y_1', y_2', y_3') &= 0 \end{aligned} \qquad (62)$$

The general system of n equations in n unknown functions has a
similar form.
Conversely, a system of form (62) can be transformed to one equation
(in this case to a third-order equation and in the general case
to an nth-order equation) in one unknown function (for example,
the function $y_1$). Therefore the general solution of system (62)
contains three arbitrary constants:

$$y_1 = \varphi_1(x, C_1, C_2, C_3), \quad y_2 = \varphi_2(x, C_1, C_2, C_3), \quad y_3 = \varphi_3(x, C_1, C_2, C_3)$$

In order to deduce an equation of form (5) (with n = 3) from
system (62) we should differentiate each of the equations twice.
Then together with equalities (62) we obtain nine relations from
which we can eliminate the eight quantities $y_2, y_3, y_2', y_3', y_2'', y_3'', y_2'''$
and $y_3'''$. This yields the sought-for third-order equation. After the
equation has been integrated we obtain the general solution
$y_1 = \varphi_1(x, C_1, C_2, C_3)$. To complete the process of solving the system
we must find $y_2$ and $y_3$. But this can be performed without integration
since $y_2$ and $y_3$ are expressed in terms of $y_1$ and its derivatives
by means of the above-mentioned relations.

A general system of differential equations of arbitrary orders 
can be transformed into a system of first-order equations by intro- 
ducing notation similar to (60). For example, suppose we have 
a system of two equations. Let the equations be of the third order 
with respect to one of the unknown functions and of the second 
order with respect to the other. Then such a system is equivalent 
to a system of five first-order equations containing five unknown 
functions. The general solution of the original system in this
example must contain five arbitrary constants.
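
The reduction described in this section is easy to try out numerically when the equation is solved for its highest derivative. The following Python sketch is only an illustration: the helper names `as_first_order_system` and `rk4_solve`, the choice of the classical Runge-Kutta scheme and the test equation $y'' = -y$ are assumptions of this example and do not come from the text.

```python
import math

def as_first_order_system(g):
    """Turn y^(n) = g(x, y, y', ..., y^(n-1)) into Y' = rhs(x, Y),
    where Y = [y1, ..., yn] stands for [y, y', ..., y^(n-1)]."""
    def rhs(x, Y):
        # y1' = y2, y2' = y3, ..., y_{n-1}' = y_n, and y_n' = g(x, y1, ..., yn)
        return Y[1:] + [g(x, *Y)]
    return rhs

def rk4_solve(rhs, x0, Y0, x_end, steps):
    """Integrate Y' = rhs(x, Y) from x0 to x_end with the classical RK4 scheme."""
    h = (x_end - x0) / steps
    x, Y = x0, list(Y0)
    for _ in range(steps):
        k1 = rhs(x, Y)
        k2 = rhs(x + h / 2, [y + h / 2 * k for y, k in zip(Y, k1)])
        k3 = rhs(x + h / 2, [y + h / 2 * k for y, k in zip(Y, k2)])
        k4 = rhs(x + h, [y + h * k for y, k in zip(Y, k3)])
        Y = [y + h / 6 * (a + 2 * b + 2 * c + d)
             for y, a, b, c, d in zip(Y, k1, k2, k3, k4)]
        x += h
    return Y

# Hypothetical test case: y'' = -y with y(0) = 0, y'(0) = 1 (exact solution sin x).
rhs = as_first_order_system(lambda x, y, dy: -y)
y_end, dy_end = rk4_solve(rhs, 0.0, [0.0, 1.0], math.pi / 2, 200)
print(y_end, dy_end)  # should be close to sin(pi/2) = 1 and cos(pi/2) = 0
```

The second-order equation is integrated here as a system of two first-order equations, and two initial values are needed, in agreement with the count of arbitrary constants given above.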

12. Geometric Interpretation of a System of First-Order Equations.
For simplicity's sake, let us take the case of a system of two first-order
equations in two unknown functions $y_1(x)$ and $y_2(x)$:

$$\begin{aligned}
F_1(x, y_1, y_2, y_1', y_2') &= 0 \\
F_2(x, y_1, y_2, y_1', y_2') &= 0
\end{aligned} \tag{63}$$

If it is possible to solve the system for $y_1'$ and $y_2'$ before performing
integration we obtain a system of the form

$$\begin{aligned}
y_1' &= f_1(x, y_1, y_2) \\
y_2' &= f_2(x, y_1, y_2)
\end{aligned} \tag{64}$$

Then we say that the system is written in the normal form.
A solution of (63), or of (64), is, of course, a pair of functions

$$y_1 = y_1(x) \quad \text{and} \quad y_2 = y_2(x) \tag{65}$$

which converts both equations into identities. According to Sec. 11,
the general solution of the system contains two arbitrary constants,
that is it has the form

$$y_1 = y_1(x; C_1, C_2), \quad y_2 = y_2(x; C_1, C_2)$$


System of equations (64) and its solution (65) can be simply
interpreted geometrically if we introduce a three-dimensional
Cartesian space with the coordinates $x$, $y_1$ and $y_2$ and regard it as
a usual geometric space. Then formulas (65) yield the parametric
representation of a curve (see Sec. VII.23) if we consider $x$ to
be a parameter (we can formally write an additional equation of
the form $x = x$). This curve is called an integral curve of system
of equations (64). Let us take an arbitrary point $M(x, y_1, y_2)$ in
the space (see Fig. 293) and calculate the values of the right-hand
sides of system (64) at the point. These values being equal
to $y_1'$ and $y_2'$, we thus determine the directions of the tangent
lines to the curves $y_1 = y_1(x)$ and $y_2 = y_2(x)$, i.e. to the projections
of the integral curve. Therefore we can determine the direction of
the tangent to the integral curve at the point $M$ provided it passes
through $M$. Point $M$ being an arbitrary point, we see that system (64)
defines a direction field in the $x, y_1, y_2$-space. Hence, an integral curve
is a curve whose tangent goes along the direction of the field at
each point (compare with Sec. 3).
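
The direction field of a normal system of two equations can be sketched in a few lines of Python. In the sketch below the function names `field_direction` and `euler_curve`, the step-by-step (Euler) construction of an approximate integral curve and the sample right-hand sides are all assumptions made for illustration; the text itself prescribes no particular numerical scheme.

```python
import math

def field_direction(f1, f2, x, y1, y2):
    """Unit vector of the field at M(x, y1, y2) for y1' = f1, y2' = f2.
    Along an integral curve dx : dy1 : dy2 = 1 : f1 : f2."""
    d = (1.0, f1(x, y1, y2), f2(x, y1, y2))
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)

def euler_curve(f1, f2, x0, y10, y20, h, steps):
    """Approximate the integral curve through M0 by short straight segments,
    each taken along the field direction at its starting point."""
    x, y1, y2 = x0, y10, y20
    points = [(x, y1, y2)]
    for _ in range(steps):
        # both right-hand sides are evaluated at the old point
        y1, y2 = y1 + h * f1(x, y1, y2), y2 + h * f2(x, y1, y2)
        x += h
        points.append((x, y1, y2))
    return points

# Hypothetical system: y1' = y2, y2' = -y1, started from M0(0, 0, 1).
f1 = lambda x, y1, y2: y2
f2 = lambda x, y1, y2: -y1
direction = field_direction(f1, f2, 0.0, 0.0, 1.0)
end_point = euler_curve(f1, f2, 0.0, 0.0, 1.0, 0.001, 1000)[-1]
print(direction)   # proportional to (1, 1, 0)
print(end_point)   # roughly (1, sin 1, cos 1)
```

Specifying the starting point $M_0$ is all that is needed to single out one curve, which is the geometric content of an initial condition.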

The variables $y_1$ and $y_2$ are involved equivalently in system (64)
whereas the variable $x$ plays a specific role. But there are cases
when all three variables are equivalent so that each of them
can be taken as an independent variable. It is preferable to write
a system of this kind in the so-called symmetric form

$$\frac{dx}{P(x, y, z)} = \frac{dy}{Q(x, y, z)} = \frac{dz}{R(x, y, z)} \tag{66}$$

It is easy to obtain form (64) from form (66) and vice versa (how
can we do this?).

Fig. 293. (L) is an integral curve

The geometric interpretation of system (66) is analogous to the
above interpretation of system (64). By relations (66), the vector
$d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$ must be parallel to the known vector
$P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}$ at each given point $M(x, y, z)$ (why?). The problem
of integrating system (66) is therefore reduced to the problem of
constructing curves in space which have prescribed directions of
their tangents at each point.

The geometric meaning of system (64) implies that in order to
isolate a unique integral curve we must specify a point $M_0(x_0, y_{1,0}, y_{2,0})$
in space through which the curve should pass. In other words, the
initial condition

$$y_1(x_0) = y_{1,0}, \quad y_2(x_0) = y_{2,0}$$


uniquely defines a certain solution of system (64). As in Sec. 7, 
in the case of a system of equations we can also encounter singular 
points and singular curves. These points and curves can be distin- 
guished in a manner similar to the one described in Sec. 7. In par- 
ticular, a singular point of system (66) is a point at which all the 
three denominators vanish. Every point of this kind is a singular 
point; the vector Pi + Qj + Rk turns into zero at such a point 
and thus it has no certain direction at the point. 

A normal system of first-order equations with an arbitrary number
of unknown functions has the general form

$$\begin{aligned}
y_1' &= f_1(x, y_1, y_2, \ldots, y_n) \\
y_2' &= f_2(x, y_1, y_2, \ldots, y_n) \\
&\;\;\vdots \\
y_n' &= f_n(x, y_1, y_2, \ldots, y_n)
\end{aligned} \tag{67}$$

A solution of the system is a set of functions of the form

$$y_1 = y_1(x), \quad y_2 = y_2(x), \quad \ldots, \quad y_n = y_n(x) \tag{68}$$

which reduces the system to an identity. The general solution of
such a system contains $n$ arbitrary constants.

To specify a unique solution we can set initial conditions of the
form

$$y_1(x_0) = y_{1,0}, \quad y_2(x_0) = y_{2,0}, \quad \ldots, \quad y_n(x_0) = y_{n,0} \tag{69}$$

Cauchy proved that there is a solution of system (67) satisfying
conditions (69) and that the solution is unique if the right-hand
sides of system (67) are continuous and possess finite partial derivatives
of the first order with respect to the variables $y_1, y_2, \ldots, y_n$
for the values $x = x_0$, $y_1 = y_{1,0}$, ..., $y_n = y_{n,0}$.

The geometric meaning of system (67), its solution (68) and conditions
(69) is that the system defines a direction field in an $(n + 1)$-dimensional
space with the coordinates $x, y_1, \ldots, y_n$ and that
the solution of the system satisfying conditions (69) defines an
integral curve lying in the space and passing through the point
$M_0(x_0, y_{1,0}, \ldots, y_{n,0})$ determined by (69) (see Sec. VII.18).

In case the right-hand sides of system (67) do not depend on the
variable $x$ the system is called autonomous. It turns out that it
is more convenient to regard its solutions as curves lying in the
$n$-dimensional space of the variables $y_1, y_2, \ldots, y_n$ which is a subspace
of the $x, y_1, \ldots, y_n$-space, the variable $x$ playing the role
of a parameter. In such a case the space of variables $y_1, \ldots, y_n$
is called the phase space of the system and the curve $y_1(x)$,
..., $y_n(x)$ is called a phase trajectory of the system. A point
$(y_1, \ldots, y_n)$ of the $n$-dimensional space is called a state of the
system and thus the space $y_1, \ldots, y_n$ may be called a state space.
For simplicity's sake, let us limit our attention to the case $n = 2$.
We shall denote the independent variable by $t$ and interpret it as
time. The unknown functions $y_1$ and $y_2$ will be designated by the
letters $x$ and $y$, i.e. $x = x(t)$ and $y = y(t)$. Then, in place of system
(67), we get a system of the form

$$\frac{dx}{dt} = P(x, y), \quad \frac{dy}{dt} = Q(x, y)$$

Multiplying the first equation by $\mathbf{i}$ and the second equation by $\mathbf{j}$
and adding together (in the vectorial sense) the results we arrive
at the vector differential equation

$$\frac{d\mathbf{r}}{dt} = \mathbf{A}(x, y) \quad \text{or} \quad \frac{d\mathbf{r}}{dt} = \mathbf{A}(\mathbf{r}) \tag{70}$$

where $\mathbf{A} = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ is a given vector field in
the phase plane $x, y$. The derivative $\frac{d\mathbf{r}}{dt}$ being the velocity (see
Sec. VII.23), we see that there is a velocity field defined in the
$x, y$-plane. Each solution $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ defines the law
of motion of a point in the plane, and the point has a prescribed 
velocity at each of its positions. We can imagine that equation (70) 
defines a flow of liquid in the phase plane and that to solutions of 
the equation there correspond laws of motion of particles of the 
liquid (but in fact the equations of motion of a liquid medium are 
much more complicated; such equations are deduced in hydrody- 
namics and they prove to be partial differential equations). The 
autonomy of equation (70) implies that the “flow” is stationary and 
therefore any two distinct trajectories have no common points. 

For example, let us write equation (4) in the form of an autonomous
system of first-order equations:

$$\frac{dy}{dt} = v, \quad \frac{dv}{dt} = -\frac{k}{M}\,y \tag{71}$$

where $y$ is the coordinate of the oscillating point and $v$ is its velocity.
In courses on physics one can find the following formula of the
total energy of an oscillating point:

$$E = \frac{Mv^2}{2} + \frac{ky^2}{2} \tag{72}$$

(let the reader think about the structure of the formula). In the
process of free oscillations without friction the energy must be
preserved. Indeed, computing the derivative of $E$ with respect to $t$
on the basis of formula (71) we obtain

$$\frac{dE}{dt} = Mv\,\frac{dv}{dt} + ky\,\frac{dy}{dt} = -kyv + kyv = 0$$

This is a mathematical proof of the law of conservation of energy
for our example. Hence, we have $E$ = const for every
solution of system (71). Therefore we see that a point
representing the state of the system in the phase plane
describes an ellipse in the $y, v$-plane. Different ellipses correspond
to different possible oscillations of the material
point, and to different ellipses there correspond oscillations
about the equilibrium position with different amplitudes
(see Fig. 294).

Fig. 294

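
The conservation of energy along a phase trajectory can also be observed numerically. The sketch below assumes that the oscillator system has the form $dy/dt = v$, $M\,dv/dt = -ky$ (the form implied by the computation of $dE/dt$ above) and that $E = Mv^2/2 + ky^2/2$; the values of `M` and `k`, the function names and the Runge-Kutta integrator are hypothetical choices made only for this illustration.

```python
def oscillator_rhs(M, k):
    """Right-hand sides of dy/dt = v, dv/dt = -(k/M) y (assumed form of the system)."""
    def rhs(t, state):
        y, v = state
        return (v, -k / M * y)
    return rhs

def rk4_step(rhs, t, s, h):
    """One step of the classical fourth-order Runge-Kutta scheme."""
    k1 = rhs(t, s)
    k2 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(s, k1)))
    k3 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(s, k2)))
    k4 = rhs(t + h, tuple(a + h * b for a, b in zip(s, k3)))
    return tuple(a + h / 6 * (p + 2 * q + 2 * r + w)
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))

def energy(M, k, state):
    """Total energy E = M v^2 / 2 + k y^2 / 2."""
    y, v = state
    return M * v * v / 2 + k * y * y / 2

M, k = 2.0, 8.0                  # hypothetical mass and stiffness
rhs = oscillator_rhs(M, k)
state, t, h = (1.0, 0.0), 0.0, 0.001
E0 = energy(M, k, state)
max_drift = 0.0
for _ in range(5000):
    state = rk4_step(rhs, t, state, h)
    t += h
    max_drift = max(max_drift, abs(energy(M, k, state) - E0))
print(E0, max_drift)  # the drift stays at the level of the integration error
```

Since $E$ stays constant, the computed points $(y, v)$ remain on the ellipse $Mv^2/2 + ky^2/2 = E_0$, whose semi-axes are $\sqrt{2E_0/k}$ along $y$ and $\sqrt{2E_0/M}$ along $v$.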
13. First Integrals. For definiteness, we now consider a system
of three first-order equations of form (62). Every relation of the form

$$\Phi(x, y_1, y_2, y_3, C) = 0 \quad (\Phi \not\equiv 0) \tag{73}$$

which is identically satisfied by any solution of the system is called
a first integral of the system of equations. Here $C$ is a constant which,
in general, is different for different solutions.

Knowing a first integral we can reduce the number of equations
in the system by unity. Indeed, if we express $y_3$ in terms of the rest
by means of relation (73) and then substitute the result into the
first two equations (62) we obtain a system of two first-order equations
containing two unknown functions $y_1$ and $y_2$. If we integrate
the last system, i.e. if we find $y_1(x)$ and $y_2(x)$, then $y_3(x)$ can be
found without integration on the basis of equality (73).

Similarly, if we know two independent first integrals we can
reduce the number of equations by two. Finally, if we manage to
find three independent first integrals (that is such integrals that
none of them is a consequence of the others) of the form

$$\begin{aligned}
\Phi_1(x, y_1, y_2, y_3, C_1) &= 0 \\
\Phi_2(x, y_1, y_2, y_3, C_2) &= 0 \\
\Phi_3(x, y_1, y_2, y_3, C_3) &= 0
\end{aligned}$$

then we obtain the general solution of system (62) represented in
an implicit form.

It is sometimes possible to find first integrals by deducing so-called
integrable combinations from given equations. For instance,
we can easily derive such a combination for the system

$$y' = y + z, \quad z' = -y + z \tag{74}$$

Indeed, we see that

$$yy' + zz' = y(y + z) + z(-y + z) = y^2 + z^2$$

and therefore

$$\frac{1}{2}\,\frac{d(y^2 + z^2)}{dx} = y^2 + z^2, \quad \frac{d(y^2 + z^2)}{y^2 + z^2} = 2\,dx \quad \text{and} \quad \ln(y^2 + z^2) = 2x + \ln C$$

Finally, we obtain the first integral

$$y^2 + z^2 = Ce^{2x}$$

It follows that if $x \to \infty$ then every solution approaches infinity,
and if $x \to -\infty$ it tends to zero. As a rule, in other cases we can
also draw some important conclusions concerning the behaviour
of solutions without completing the integration of the system in question
when we know one or more first integrals of the system. Returning
to system (74) we see that we can obtain another first integral if
we divide one of the equations by the other (let the reader perform
the calculations!).
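
A first integral can be checked numerically by following a solution and watching the supposedly constant quantity. The sketch below rests on an assumption: it takes the system of this example to be $y' = y + z$, $z' = -y + z$, which is the form consistent with the combination $yy' + zz' = y^2 + z^2$ and with the first integral $y^2 + z^2 = Ce^{2x}$. The integrator, the step size and the initial point are arbitrary illustrative choices.

```python
import math

def rhs(x, y, z):
    # Assumed form of the system: y' = y + z, z' = -y + z
    return y + z, -y + z

def rk4(x, y, z, h):
    """One classical Runge-Kutta step for the pair (y, z)."""
    k1y, k1z = rhs(x, y, z)
    k2y, k2z = rhs(x + h / 2, y + h / 2 * k1y, z + h / 2 * k1z)
    k3y, k3z = rhs(x + h / 2, y + h / 2 * k2y, z + h / 2 * k2z)
    k4y, k4z = rhs(x + h, y + h * k3y, z + h * k3z)
    return (y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y),
            z + h / 6 * (k1z + 2 * k2z + 2 * k3z + k4z))

x, y, z, h = 0.0, 1.0, 2.0, 0.001
C = y * y + z * z          # the constant fixed by the initial point (x = 0)
worst = 0.0
for _ in range(2000):
    y, z = rk4(x, y, z, h)
    x += h
    ratio = (y * y + z * z) * math.exp(-2 * x) / C
    worst = max(worst, abs(ratio - 1.0))
print(worst)  # stays near zero if (y^2 + z^2) e^(-2x) is indeed constant
```

The quantity $(y^2 + z^2)e^{-2x}$ keeps its initial value along the computed solution, while $y^2 + z^2$ itself grows like $e^{2x}$, in agreement with the conclusion about the behaviour as $x \to \infty$.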

There are some cases when first integrals can be derived on the
basis of certain physical considerations, most often from various
conservation laws.

For instance, relation (72) is a first integral of system (71) because
we can regard $E$ as an arbitrary constant $C$. Expressing $v$ in terms
of $y$ from the relation and substituting the result into the first
equation (71) we can easily complete the integration (let the reader
do this!).

In conclusion we note that, as we have seen, it is more natural
to consider systems of equations in which the number of equations
coincides with the number of unknown functions. Such systems are
called closed. If the number of unknown functions exceeds the
number of equations the system is called non-closed (or underdetermined).
The excess unknown functions can be chosen arbitrarily.
The fact that a system is non-closed usually indicates that
some necessary relations were not taken into account. If the number
of equations is greater than the number of unknown functions the 
system is called overdetermined. Such a system is usually contra- 
dictory, that is it has no solutions. An overdetermined system can 
also indicate that there is an interdependence between the equations 
entering into the system in question. This means that some of the 
equations are consequences of the rest and therefore they are unne- 
cessary and can be dropped. Such a situation may also show that 
when deducing the equations we made a mistake. 


§ 4. Linear Equations of General Form 


14. Homogeneous Linear Equations. The methods of investigating
linear equations of arbitrary orders have many features similar
to those of the methods of solving first-order linear equations (see
Sec. 4). But in the general case it is no longer possible to integrate
an equation by quadratures. For the sake of simplicity, we first
consider second-order equations. The equation

$$z'' + p(x)\,z' + q(x)\,z = 0 \tag{75}$$

whose left-hand side is a linear function in the unknown function
and its derivatives is called a homogeneous linear equation.

For brevity, let us denote the left-hand side of equation (75) as
$L[z]$, i.e. in this case, by definition,

$$L[z] = z'' + p(x)\,z' + q(x)\,z$$

Then equation (75) can be rewritten in the form

$$L[z] = 0$$

The expression $L[z]$ possesses the following properties:

$$\begin{aligned}
L[z_1 + z_2] &= (z_1 + z_2)'' + p(x)\,(z_1 + z_2)' + q(x)\,(z_1 + z_2) \\
&= (z_1'' + p(x)\,z_1' + q(x)\,z_1) + (z_2'' + p(x)\,z_2' + q(x)\,z_2) \\
&= L[z_1] + L[z_2], \\
L[Cz] &= C\,L[z]
\end{aligned}$$

where $C$ = const (the last property is verified similarly).
Expressions of this type are called linear operators; we mentioned
them in Sec. XIV.26.

We can easily prove the following properties of equation (75):

1. A sum of solutions of equation (75) is a solution of the same
equation. Indeed, if $z_1$ and $z_2$ are two solutions of equation (75)
then

$$L[z_1] = 0 \quad \text{and} \quad L[z_2] = 0$$

and thus

$$L[z_1 + z_2] = L[z_1] + L[z_2] = 0$$

2. If we multiply a solution of equation (75) by a constant we
get a solution of the same equation. The verification of the second
property is analogous to that of the first property.

Now we can combine properties 1 and 2 in the following way:
a linear combination of solutions of equation (75) (see Sec. VII.5)
is a solution of the same equation. For instance, if $z_1(x)$ and $z_2(x)$
satisfy equation (75) then

$$z = C_1 z_1(x) + C_2 z_2(x) \tag{76}$$

also satisfies the same equation for any values of the constants
$C_1$ and $C_2$.

3. A function which is identically equal to zero always satisfies 
equation (75). 

4. If we know a nonzero solution of equation (75) we can reduce
by unity the order of the equation. Indeed, let $z_1(x)$ be such
a solution. Make a substitution of the form $z = z_1 u$ where $u = u(x)$
is a new unknown function. Then we get

$$(z_1'' u + 2z_1' u' + z_1 u'') + p\,(z_1' u + z_1 u') + q z_1 u = 0$$

that is

$$z_1 u'' + (2z_1' + p z_1)\,u' + (z_1'' + p z_1' + q z_1)\,u = 0$$

But since $L[z_1] = 0$, the last term vanishes and hence, after the
substitution $u' = v$, we finally obtain

$$z_1 v' + (2z_1' + p z_1)\,v = 0$$

which is a first-order linear equation, that is an equation whose
order is less by unity than the order of the original equation.
Now we complete the integration:

$$\frac{dv}{v} = -\frac{2z_1'}{z_1}\,dx - p\,dx, \quad \ln|v| = -2\ln|z_1| - \int p(x)\,dx + \ln C_2,$$

$$v = \frac{C_2}{z_1^2}\,e^{-\int p(x)\,dx}, \quad u = C_2 \int \frac{1}{z_1^2}\,e^{-\int p(x)\,dx}\,dx + C_1$$

and hence

$$z = C_1 z_1 + C_2 z_1 \int \frac{1}{z_1^2}\,e^{-\int p(x)\,dx}\,dx \tag{77}$$

The function in front of which the factor $C_2$ is placed is one of the
particular solutions of equation (75) because we can obtain it from
general solution (77) by putting $C_1 = 0$ and $C_2 = 1$. Therefore,
denoting this solution by $z_2$ we arrive at the fifth property.

5. The general solution of equation (75) has form (76) where $C_1$
and $C_2$ are arbitrary constants and $z_1$ and $z_2$ are two particular
solutions of the equation.
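
The construction of a second solution from a known one can be tried numerically. The sketch below assumes the second solution is $z_2 = z_1 \int z_1^{-2}\, e^{-\int p\,dx}\,dx$ for the equation $z'' + p(x)z' + q(x)z = 0$, with both indefinite integrals replaced by definite integrals from a base point `x0`. The helper names, the use of Simpson's rule and the two test equations are assumptions of this example.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def second_solution(z1, p, x, x0=0.0):
    """z2(x) = z1(x) * int_{x0}^{x} exp(-int_{x0}^{s} p) / z1(s)^2 ds.
    Requires z1 to be nonzero on the interval between x0 and x."""
    inner = lambda s: math.exp(-simpson(p, x0, s)) / z1(s) ** 2
    return z1(x) * simpson(inner, x0, x)

# Hypothetical example 1: z'' - z = 0 (p = 0), known solution z1 = e^x.
# The formula then gives e^x * (1 - e^(-2x)) / 2 = sinh x.
z2_a = second_solution(math.exp, lambda s: 0.0, 1.0)

# Hypothetical example 2: z'' - 2z' + z = 0 (p = -2), known solution z1 = e^x.
# The formula then gives x * e^x.
z2_b = second_solution(math.exp, lambda s: -2.0, 1.0)

print(z2_a, math.sinh(1.0))
print(z2_b, math.e)
```

In both examples the computed function is not proportional to $z_1 = e^x$, so the pair can serve in the general solution.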


It should be noted that formula (76) in fact expresses the general
solution of equation (75) if and only if the solutions $z_1$ and $z_2$ are
linearly independent. The concept of linear dependence of functions
is similar to that of vectors (see Sec. VII.5). Namely, we say that
several given functions are linearly dependent if one of them is a linear
combination of the rest. In particular, two functions $z_1(x)$
and $z_2(x)$ are linearly dependent if and only if $z_2(x) = Cz_1(x)$
or $z_1(x) = Cz_2(x)$. Thus, $z_1(x)$ and $z_2(x)$ are proportional to each
other in this case. Hence, we see that formula (76) does not yield
the general solution in such a case because

$$C_1 z_1 + C_2 z_2 = C_1 z_1 + C_2 C z_1 = (C_1 + C_2 C)\,z_1(x) = D z_1(x)$$

where $D = C_1 + C_2 C$ is a constant. This means that although formally
we have two arbitrary constants on the right-hand side of
expression (76), these constants are not essential parameters since
their number can be reduced by unity (see Sec. X.2).

All the enumerated properties also hold for a homogeneous linear
equation of any order of the form

$$z^{(n)} + p(x)\,z^{(n-1)} + q(x)\,z^{(n-2)} + \ldots + s(x)\,z = 0 \tag{78}$$

with the only exception of the fifth property which must be replaced
by the following assertion: the general solution of equation (78)
has the form

$$z = C_1 z_1(x) + C_2 z_2(x) + \ldots + C_n z_n(x) \tag{79}$$

instead of (76). The functions $z_1(x), z_2(x), \ldots, z_n(x)$ entering
into formula (79) are any $n$ linearly independent solutions of equation
(78) and $C_1, C_2, \ldots, C_n$ are arbitrary constants. Any family of $n$
linearly independent solutions of equation (78) is called its fundamental
system of solutions. Thus, the general solution of equation
(78) is a linear combination of solutions forming a fundamental system
of solutions with $n$ arbitrary constant coefficients. Using the terminology
of Secs. VII.17-19 we can say that the totality of all the
solutions of equation (78) is an $n$-dimensional linear space. A fundamental
system of solutions is a basis in such a space.

In conclusion we note that the left-hand side of equation (78)
is a homogeneous function in the variables $z, z', \ldots, z^{(n)}$
and therefore we can reduce by unity the order of the equation with
the help of the method of Sec. 10 (see type 4). But this procedure
is rarely performed because after the order is reduced the equation
becomes non-linear.

15. Non-Homogeneous Equations. We now consider a non-homogeneous
linear equation of the form

$$y'' + p(x)\,y' + q(x)\,y = f(x) \tag{80}$$

According to Sec. 14, let us denote the left-hand side by $L[y]$.

1. A particular solution of equation (80) being known, it is possible
to reduce the problem of integrating the equation to the problem
of integrating a homogeneous equation (75) which corresponds
to (80) [that is an equation of form (75) having the same coefficients
on the left-hand side as equation (80) but with zero right-hand side,
or, simply, an equation that is obtained from (80) by dropping its
right-hand side].

Indeed, if $Y(x)$ is such a solution then after the substitution

$$y = Y(x) + z \tag{81}$$

where $z = z(x)$ is a new unknown function we obtain

$$L[Y + z] = f(x)$$

which gives

$$L[Y] + L[z] = f(x)$$

But $L[Y] = f(x)$ (why is it so?). Hence we arrive at an equation
of form (75) containing $z$ as an unknown function.

Thus, the general solution of non-homogeneous linear equation (80)
is the sum of any particular solution of the equation and the general
solution of the corresponding homogeneous linear equation (compare
this result with the general solution of the first-order equation
considered in Sec. 4).

2. If the right-hand side $f(x)$ is a linear combination of some
functions, for instance, of two functions, that is $f(x) = \alpha f_1(x) + \beta f_2(x)$
(where $\alpha$ and $\beta$ are constants), and if certain particular
solutions $Y_1(x)$ and $Y_2(x)$ of equation (80) having right-hand sides
equal to $f_1(x)$ and $f_2(x)$, respectively, are known, then the function

$$Y(x) = \alpha Y_1(x) + \beta Y_2(x)$$

is a particular solution of equation (80) with the right-hand side
$f(x)$.

This simple property is in fact a special case of a general principle
called the superposition principle (see Sec. XIV.26). The proof of
the property is left to the reader.

3. If the general solution of homogeneous equation (75) is known
the general solution of equation (80) can be found by quadratures.
This is performed by means of a method discovered by Lagrange.
It is called the method of variation of parameters (we used the
method in Sec. 4 in solving the first-order non-homogeneous
linear equation). As we know, the general solution of equation (75)
is of form (76). By analogy with formula (22), we seek a solution
of equation (80) in the form

$$y = \varphi_1(x)\,z_1(x) + \varphi_2(x)\,z_2(x) \tag{82}$$
where $\varphi_1$ and $\varphi_2$ are some functions yet unknown. But there are
two such functions here and only one equation (80). Therefore, in
order to find the functions, we shall impose one additional
condition which will be put down below [condition (84)].
Differentiating equality (82) we obtain

$$y' = (\varphi_1 z_1' + \varphi_2 z_2') + (\varphi_1' z_1 + \varphi_2' z_2) \tag{83}$$

We now set the condition that the expression inside the second
parentheses on the right-hand side of relation (83) should identically
vanish:

$$\varphi_1' z_1 + \varphi_2' z_2 = 0 \tag{84}$$

Then when differentiating equality (83), we must take into account
only the expression entering into the first parentheses, that is

$$y'' = (\varphi_1 z_1'' + \varphi_2 z_2'') + (\varphi_1' z_1' + \varphi_2' z_2') \tag{85}$$

Substituting the results (82), (83) and (85) thus obtained into
equation (80) and dropping the sum which equals zero we obtain

$$\varphi_1\,(z_1'' + p z_1' + q z_1) + \varphi_2\,(z_2'' + p z_2' + q z_2) + (\varphi_1' z_1' + \varphi_2' z_2') = f(x)$$

(check the calculations!).
The functions $z_1$ and $z_2$ satisfying equation (75), the expressions in
the first two parentheses in the last equation vanish. Thus we arrive
at the equality

$$\varphi_1' z_1' + \varphi_2' z_2' = f(x) \tag{86}$$

Hence, now we have two relations (84) and (86) for determining
$\varphi_1$ and $\varphi_2$. The functions $z_1$, $z_2$ and $f(x)$ being regarded as known,
we have a system of two algebraic equations of the first degree in
two unknown quantities, i.e. in the derivatives $\varphi_1'(x)$ and $\varphi_2'(x)$.
Solving the system we find the unknowns and then, integrating,
we obtain $\varphi_1$ and $\varphi_2$.
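
The algebraic step of the method can be written out explicitly. The sketch below assumes the two relations are $\varphi_1' z_1 + \varphi_2' z_2 = 0$ and $\varphi_1' z_1' + \varphi_2' z_2' = f$ and solves them by determinants; the function name `varied_parameter_rates` and the sample data are hypothetical.

```python
import math

def varied_parameter_rates(z1, dz1, z2, dz2, f, x):
    """Solve  phi1' z1  + phi2' z2  = 0
              phi1' z1' + phi2' z2' = f
    for (phi1'(x), phi2'(x)) by Cramer's rule."""
    det = z1(x) * dz2(x) - z2(x) * dz1(x)   # determinant of the system
    if det == 0.0:
        raise ValueError("z1 and z2 are not linearly independent at this point")
    return -z2(x) * f(x) / det, z1(x) * f(x) / det

# Hypothetical data: z1 = cos 3x, z2 = sin 3x (solutions of z'' + 9z = 0), f(x) = x.
w = 3.0
rates = varied_parameter_rates(
    lambda x: math.cos(w * x), lambda x: -w * math.sin(w * x),
    lambda x: math.sin(w * x), lambda x: w * math.cos(w * x),
    lambda x: x, 0.5)
print(rates)  # (-f sin(wx) / w, f cos(wx) / w) evaluated at x = 0.5
```

The determinant of the system does not vanish when $z_1$ and $z_2$ are linearly independent solutions, so the derivatives are determined uniquely, and one integration of each then gives $\varphi_1$ and $\varphi_2$.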

For instance, let us take the simplest equation of forced oscillations
which is obtained if we add an external force $P(t)$ to the right-hand
side of equation (3). Dividing both sides of the equation by $M$
we derive

$$y'' + \omega_0^2\,y = f(t) \tag{87}$$

where the notation $\omega_0 = \sqrt{k/M}$ and $f(t) = P(t)/M$ is introduced.

The corresponding homogeneous equation

$$z'' + \omega_0^2\,z = 0 \tag{88}$$

has two linearly independent particular solutions of the form
$z_1 = \cos\omega_0 t$ and $z_2 = \sin\omega_0 t$ (this fact can be directly verified by
substituting the solutions into the equation). The general solution
can therefore be written as

$$z = C_1 \cos\omega_0 t + C_2 \sin\omega_0 t \tag{89}$$

In particular, it follows that $\omega_0$ is the fundamental frequency
(natural frequency) of the system in question, that is the frequency
of free oscillations arising in the system when there are no external
forces.

According to formula (82), we seek a solution of equation (87)
in the form

$$y = \varphi_1(t) \cos\omega_0 t + \varphi_2(t) \sin\omega_0 t \tag{90}$$

Then equations (84) and (86) turn into

$$\begin{aligned}
\varphi_1' \cos\omega_0 t + \varphi_2' \sin\omega_0 t &= 0 \\
\varphi_1'\,(-\omega_0 \sin\omega_0 t) + \varphi_2'\,\omega_0 \cos\omega_0 t &= f(t)
\end{aligned}$$

From this we immediately find

$$\varphi_1'(t) = -\frac{1}{\omega_0}\,f(t) \sin\omega_0 t \quad \text{and} \quad \varphi_2'(t) = \frac{1}{\omega_0}\,f(t) \cos\omega_0 t$$


Now we must integrate the last relations. It is inconvenient to 
use indefinite integrals here because they contain arbitrary constants 
which are difficult to specify. It is therefore better to put down 
the results in the form of definite integrals with a fixed lower limit 
of integration and with a variable upper limit. Let the lower limit 
be equal to zero (we assume that $t = 0$ is the initial instant of time).
Then we have

$$\varphi_1(t) = -\frac{1}{\omega_0} \int_0^t f(t) \sin\omega_0 t\,dt + C_1$$

where $C_1$ is an arbitrary constant. Here the variable $t$ is understood
in a two-fold sense because the letter $t$ designates both the variable of
integration and the upper limit. It is therefore more convenient
to use the fact that the value of a definite integral is independent
of the notation of the variable of integration (see Sec. XIV.3)
and rewrite the expression of $\varphi_1(t)$ in the form

$$\varphi_1(t) = -\frac{1}{\omega_0} \int_0^t f(\tau) \sin\omega_0\tau\,d\tau + C_1$$

The expression of $\varphi_2(t)$ is found similarly:

$$\varphi_2(t) = \frac{1}{\omega_0} \int_0^t f(\tau) \cos\omega_0\tau\,d\tau + C_2$$

Substituting $\varphi_1$ and $\varphi_2$ thus found into (90) we derive

$$y = -\frac{1}{\omega_0} \cos\omega_0 t \int_0^t f(\tau) \sin\omega_0\tau\,d\tau + \frac{1}{\omega_0} \sin\omega_0 t \int_0^t f(\tau) \cos\omega_0\tau\,d\tau + C_1 \cos\omega_0 t + C_2 \sin\omega_0 t$$

We now insert $\cos\omega_0 t$ and $\sin\omega_0 t$ under the integral sign. We could
not do this if we had not changed the notation of the variable of
integration. But now this is permissible and thus we get the expression

$$y = \frac{1}{\omega_0} \int_0^t \left[-f(\tau) \cos\omega_0 t \sin\omega_0\tau + f(\tau) \sin\omega_0 t \cos\omega_0\tau\right] d\tau + C_1 \cos\omega_0 t + C_2 \sin\omega_0 t$$

in which both integrals are combined. From this we obtain the general
solution of equation (87):

$$y = \frac{1}{\omega_0} \int_0^t \sin\omega_0 (t - \tau)\,f(\tau)\,d\tau + C_1 \cos\omega_0 t + C_2 \sin\omega_0 t \tag{91}$$


The arbitrary constants $C_1$ and $C_2$ can be determined if we set,
for example, the initial conditions

$$y(0) = y_0, \quad y'(0) = v_0 \tag{92}$$

Substituting $t = 0$ into both sides of (91) we obtain $y_0 = C_1$. To
use the second condition (92) we must differentiate equality (91)
with respect to $t$ and then substitute $t = 0$. When differentiating
the integral we must take into account that $t$ enters into the integral
as an upper limit of integration and as a parameter under the
integral sign. Hence, using formula (XIV.80), we deduce

$$\begin{aligned}
y' &= \frac{1}{\omega_0} \int_0^t \omega_0 \cos\omega_0 (t - \tau)\,f(\tau)\,d\tau + \frac{1}{\omega_0} \Big[\sin\omega_0 (t - \tau)\,f(\tau)\Big]_{\tau = t} - C_1 \omega_0 \sin\omega_0 t + C_2 \omega_0 \cos\omega_0 t \\
&= \int_0^t \cos\omega_0 (t - \tau)\,f(\tau)\,d\tau - C_1 \omega_0 \sin\omega_0 t + C_2 \omega_0 \cos\omega_0 t
\end{aligned}$$

Thus we obtain

$$v_0 = C_2 \omega_0, \quad \text{i.e.} \quad C_2 = \frac{v_0}{\omega_0}$$

Consequently, the particular solution of equation (87) satisfying
the initial conditions (92) is

$$y = \frac{1}{\omega_0} \int_0^t \sin\omega_0 (t - \tau)\,f(\tau)\,d\tau + y_0 \cos\omega_0 t + \frac{v_0}{\omega_0} \sin\omega_0 t$$
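
The final formula lends itself to direct computation. The sketch below assumes that the solution of $y'' + \omega_0^2 y = f(t)$ with $y(0) = y_0$, $y'(0) = v_0$ is $y = \frac{1}{\omega_0}\int_0^t \sin\omega_0(t - \tau)\,f(\tau)\,d\tau + y_0\cos\omega_0 t + \frac{v_0}{\omega_0}\sin\omega_0 t$ and evaluates the integral by Simpson's rule. The function names, the numerical values and the constant test force are assumptions of this example.

```python
import math

def simpson(g, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def forced_response(f, omega0, y0, v0, t):
    """y(t) for y'' + omega0^2 y = f(t), y(0) = y0, y'(0) = v0."""
    integral = simpson(lambda tau: math.sin(omega0 * (t - tau)) * f(tau), 0.0, t)
    return (integral / omega0
            + y0 * math.cos(omega0 * t)
            + v0 / omega0 * math.sin(omega0 * t))

# Hypothetical check with a constant force f = 1, for which the integral is elementary:
# y = (1 - cos(w t)) / w^2 + y0 cos(w t) + (v0 / w) sin(w t).
w, y0, v0, t = 2.0, 0.5, -1.0, 1.3
numeric = forced_response(lambda tau: 1.0, w, y0, v0, t)
exact = ((1 - math.cos(w * t)) / w ** 2
         + y0 * math.cos(w * t) + v0 / w * math.sin(w * t))
print(numeric, exact)
```

The integral term describes the motion caused by the external force alone (zero initial data), and the remaining two terms describe the free oscillation fixed by the initial conditions.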


All the enumerated properties hold for the general equation

$$y^{(n)} + p(x)\,y^{(n-1)} + q(x)\,y^{(n-2)} + \ldots + s(x)\,y = f(x) \tag{93}$$

as well. The method of variation of arbitrary constants is applied
to equation (93) in the following way: we substitute functions
$\varphi_1(x), \varphi_2(x), \ldots, \varphi_n(x)$ for $C_1, C_2, \ldots, C_n$, respectively, into
formula (79) and then differentiate the formula in succession up
to the $(n - 1)$th derivative inclusive. After each of the differentiations
we obtain a group of terms containing the derivatives $\varphi_1'$,
$\varphi_2', \ldots, \varphi_n'$, and we equate each of the groups to zero. Finally,
differentiating formula (79) [with $C_1, C_2, \ldots, C_n$ replaced by
$\varphi_1(x), \varphi_2(x), \ldots, \varphi_n(x)$] the $n$th time and substituting all the
expressions thus obtained into equation (93) we derive the $n$th
relation connecting $\varphi_1', \varphi_2', \ldots, \varphi_n'$. Then $\varphi_1', \varphi_2', \ldots, \varphi_n'$ are
found by solving the linear algebraic system of equations etc.

4. Any solution of equation (18) or of equation (93) can be extended
to any interval in which the coefficients and the right-hand
side of the equation do not approach infinity. Generally, this is
not the case for non-linear equations because it can happen that
a solution (or its derivatives) approaches infinity for some finite value
of $x$. A simple example of this fact is equation (38) in the case >1
(see Fig. 288). Here the slope of the direction field increases so fast,
as $y$ increases, that integral curves travel into infinity after passing
only a finite distance along the $x$-axis. On the contrary, a solution
of a linear equation (for instance, the solution $y = Ce^{Mx}$ of the
equation $y' = My$) cannot approach infinity for a finite value of $x$.

16. Boundary-Value Problems. In the preceding section we studied 
problems in which we isolated a particular solution from the general 
solution by means of initial conditions which define certain values 
of an unknown function and of its derivatives for a single value 
of the argument. But there are some other ways of isolating a par- 
ticular solution from the general solution which are encountered 
in practical problems. At the same time it is common for all methods 
of determining a particular solution that the number of additional 
conditions imposed on a sought-for solution must be equal to the 
number of degrees of freedom (see Sec. X.2) in the general solution 
of an equation in question, that is to the order of the equation. 

These additional conditions can be written, in the case of equation
(5) of the $n$th order, in the form

$$G_k[y] = \alpha_k \quad (k = 1, 2, \ldots, n) \tag{94}$$

where $G_k[y]$ $(k = 1, 2, \ldots, n)$ is a given combination of the
values of the sought-for function $y(x)$ and its derivatives taken
for different values of the argument, in the general case, and $\alpha_1$,
$\alpha_2, \ldots, \alpha_n$ are given numbers. [More precisely, $G_k[y]$
$(k = 1, 2, \ldots, n)$ is a given functional, the notion of a functional
being mentioned in Sec. XIV.4.] For instance, if we take the case
of initial conditions (11) then $G_k[y]$ is simply equal to $y^{(k-1)}(x_0)$.

If general solution (8) of a given equation is known then to find
the particular solution we are interested in we must substitute the
expression of the general solution into conditions (94) which results
in a system of $n$ equations in $n$ unknowns $C_1, C_2, \ldots, C_n$. If

$$G_k[C_1 y_1 + C_2 y_2] = C_1 G_k[y_1] + C_2 G_k[y_2] \quad (C_1, C_2 = \text{const})$$

then conditions (94) are called linear. If, in addition, all $\alpha_k = 0$
then the conditions are called homogeneous linear conditions. If
we have functions (which may not necessarily be solutions of the
differential equation) satisfying homogeneous linear conditions then
any linear combination of the functions satisfies the conditions as
well. In fact, if, for example, $G_k[y_1] = 0$ and $G_k[y_2] = 0$ then

$$G_k[C_1 y_1 + C_2 y_2] = C_1 G_k[y_1] + C_2 G_k[y_2] = C_1 \cdot 0 + C_2 \cdot 0 = 0$$

The difference of two functions satisfying the same non-homogeneous
linear conditions satisfies the corresponding homogeneous condition
(that is a homogeneous linear condition with the same left-hand
side). Let the reader verify the assertion!

In our further discussion we shall limit ourselves to solutions
of the equation

$$y'' + p(x)\,y' + q(x)\,y = f(x) \quad (a \le x \le b) \tag{95}$$

with the additional conditions

$$y(a) = \alpha_1, \quad y(b) = \alpha_2 \tag{96}$$

although all the general conclusions we shall draw remain true for
linear equations of any order $n$ with additional linear conditions
(94) of arbitrary form. The interval $(a, b)$ will be regarded as being
finite. We shall also assume that the functions $p(x)$, $q(x)$ and $f(x)$
are finite. Then we can regard any solution as extended over the
whole interval including its end-points (see property 4 in Sec. 15). 
Conditions of form (96) containing only the end-point values of 
functions defined over the interval in which the solution is sought 
for are called boundary conditions. The corresponding problem of 
determining the solution of the equation is called a boundary-value 
problem. 

The solution of the above boundary-value problem is found on 
the basis of the general solution of equation (95) which is 


$$y(x) = Y(x) + C_1 z_1(x) + C_2 z_2(x) \qquad (97)$$

where $Y(x)$ is a certain particular solution of equation (95) and
$z_1$ and $z_2$ are two linearly independent particular solutions of the
corresponding homogeneous equation (see property 4 in Sec. 15).
Substituting formula (97) into condition (96) we obtain two relations
for $C_1$ and $C_2$:

$$\begin{aligned} C_1 z_1(a) + C_2 z_2(a) &= \alpha_1 - Y(a) \\ C_1 z_1(b) + C_2 z_2(b) &= \alpha_2 - Y(b) \end{aligned} \qquad (98)$$

In solving this system of two algebraic equations of the first 
degree we can encounter the following two cases (see Secs. VI.4 
and VI.6): 

1. Basic case. The determinant of the system is unequal to zero.
Since in this case system (98) has a certain uniquely defined solution,
equation (95) with conditions (96) possesses one and only one solution
for any right-hand member $f(x)$ and any numbers $\alpha_1$, $\alpha_2$.

2. Singular case. The determinant of the system is equal to zero.
In such a case system (98) is, as a rule, contradictory but it may
have infinitely many solutions for certain specific right-hand sides.
Hence, equation (95) with conditions (96) has, as a rule, no solution
when the function $f(x)$ and the numbers $\alpha_1$, $\alpha_2$ are chosen arbitrarily
but it has an infinitude of solutions when $f(x)$, $\alpha_1$, $\alpha_2$ are
chosen in a certain specific manner. For example, we can easily
verify that if $f(x)$ and $\alpha_1$ are given beforehand then there are infinitely
many solutions only for one specific value of $\alpha_2$, the problem
having no solutions for any other value of $\alpha_2$.

It should be noted that it is the form of the left-hand sides of 
equation (95) and of conditions (96) that determines which of the 
above cases takes place. 

According to Sec. VI.6, the basic case takes place if and only if
the corresponding homogeneous problem [in which we must put
$f(x) \equiv 0$, $\alpha_1 = 0$ and $\alpha_2 = 0$] has only the zero solution. In the
singular case the homogeneous problem has infinitely many solu- 
tions. If the non-homogeneous problem possesses at least one solu- 
tion then adding the general solution of the corresponding homoge- 
neous problem to the particular solution of the non-homogeneous 
problem we obtain the general solution of the non-homogeneous 
problem. 
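The distinction between the two cases can be tried out numerically. The following is a minimal sketch, not a general solver: it assumes the functions $Y$, $z_1$, $z_2$ are already known, and the helper name `solve_constants` and the sample problem $y'' + y = x$, $y(0) = y(b) = 0$ are chosen here only for illustration.

```python
import math

def solve_constants(z1, z2, Y, a, b, alpha1, alpha2, tol=1e-12):
    """Solve the 2x2 system for C1, C2 by Cramer's rule.

    Returns (C1, C2) in the basic case (non-zero determinant)
    and None in the singular case (determinant numerically zero).
    """
    det = z1(a) * z2(b) - z2(a) * z1(b)
    if abs(det) < tol:
        return None
    r1 = alpha1 - Y(a)  # right-hand side of the first equation
    r2 = alpha2 - Y(b)  # right-hand side of the second equation
    c1 = (r1 * z2(b) - z2(a) * r2) / det
    c2 = (z1(a) * r2 - r1 * z1(b)) / det
    return c1, c2

# Hypothetical sample: y'' + y = x has the particular solution Y = x,
# and cos x, sin x solve the homogeneous equation.
Y = lambda x: x
basic = solve_constants(math.cos, math.sin, Y, 0.0, math.pi / 2, 0.0, 0.0)
singular = solve_constants(math.cos, math.sin, Y, 0.0, math.pi, 0.0, 0.0)
```

On the interval $(0, \pi/2)$ the determinant is non-zero and the constants are found uniquely; on $(0, \pi)$ the determinant vanishes and the sketch reports the singular case instead of dividing by zero.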

When we deal with an initial-value problem (Cauchy’s problem)
we always have the basic case because the solution always exists
and is unique. In solving a boundary-value problem we can encounter
the singular case as well. For instance, let us consider the following
problem containing a constant parameter $\lambda = \text{const}$:

$$y'' + \lambda y = f(x) \quad (0 \le x \le l), \quad y(0) = \alpha_1, \quad y(l) = \alpha_2 \qquad (99)$$


Let us first take the case $\lambda > 0$. Then the functions $z_1(x) = \cos \sqrt{\lambda}\,x$
and $z_2(x) = \sin \sqrt{\lambda}\,x$ are two linearly independent solutions of the
corresponding homogeneous differential equation and hence the
determinant of system (98) is equal to

$$\begin{vmatrix} z_1(0) & z_2(0) \\ z_1(l) & z_2(l) \end{vmatrix} = \begin{vmatrix} 1 & 0 \\ \cos \sqrt{\lambda}\,l & \sin \sqrt{\lambda}\,l \end{vmatrix} = \sin \sqrt{\lambda}\,l$$

Equating the determinant to zero we find the values

$$\lambda = \left(\frac{\pi}{l}\right)^2, \ \left(\frac{2\pi}{l}\right)^2, \ \left(\frac{3\pi}{l}\right)^2, \ \ldots \qquad (100)$$

for which there is a singular case for problem (99). This means that
either the existence or the uniqueness of the solution is violated.
The set of values of a parameter entering into the statement of
a problem for which the problem degenerates in a certain sense
(see Sec. II.8) is called the spectrum of the problem. We
suggest that the reader should verify that in the case $\lambda \le 0$ we
always have the basic case for problem (99). This means that
the set of the values (100) is the spectrum of the problem.
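A short numerical check of this spectrum is sketched here. The function names and the length $l = 2$ are assumptions made for illustration; for $\lambda = 0$ the sketch takes $z_1 = 1$, $z_2 = x$ and for $\lambda < 0$ it takes the hyperbolic cosine and sine, which is one natural choice of independent solutions.

```python
import math

def determinant(lam, l):
    """Determinant of the 2x2 system for y'' + lam*y = f(x), y(0), y(l) given."""
    if lam > 0:
        # z1 = cos(sqrt(lam) x), z2 = sin(sqrt(lam) x)
        return math.sin(math.sqrt(lam) * l)
    if lam == 0:
        # z1 = 1, z2 = x
        return l
    # z1 = cosh(sqrt(-lam) x), z2 = sinh(sqrt(-lam) x)
    return math.sinh(math.sqrt(-lam) * l)

def spectrum(l, count):
    """First `count` values (n*pi/l)^2 at which the determinant vanishes."""
    return [(n * math.pi / l) ** 2 for n in range(1, count + 1)]

l = 2.0
values = spectrum(l, 3)
dets_on_spectrum = [determinant(lam, l) for lam in values]
dets_off_spectrum = [determinant(lam, l) for lam in (-4.0, -1.0, 0.0)]
```

The determinant is zero (up to rounding) on the listed values and strictly positive for the non-positive sample values of $\lambda$.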
The result obtained above has an important application to the
problem of investigating the stability of an elastic bar when it is
subjected to buckling by a central force. Let a homogeneous (i.e.
having the same properties in all its parts) elastic weightless bar
be placed along the $x$-axis. Suppose that the bar is compressed by
a force $P$ and that its ends can rotate about the points of support
which are permanently kept on the $x$-axis (see Fig. 295a). When the
force increases and attains a certain critical value $P_{cr}$ the bar buckles
and takes the form depicted in Fig. 295b. Let us denote the transverse
deviation of a point $x$ of the bar from its original position
by $y$. As it is proved in courses on strength of materials, the function
$y(x)$ satisfies, within a sufficient accuracy, the differential equation
and the boundary conditions

$$y'' + \frac{P}{EJ}\,y = 0, \quad y(0) = y(l) = 0 \qquad (101)$$

where $E$ is the so-called modulus of elasticity (also called Young’s
modulus after T. Young, 1773-1829, an English physicist, physician
and astronomer, one of the creators of the wave theory of light)
and $J$ is the moment of inertia of the cross section of the bar. As
follows from (100), there is a basic case for problem (101) when

$$\frac{P}{EJ} < \left(\frac{\pi}{l}\right)^2 \qquad (102)$$

[Fig. 295. (a) The straight bar compressed by the force $P$; (b) the buckled bar.]

This means that if condition (102) is fulfilled the problem has only
the zero solution, that is there is no buckling in this case. When 
inequality (102) turns into the equality, as P increases, there appears 
a singular case. Then, besides the zero solution, problem (101)
possesses solutions of the form $y = C \sin \frac{\pi}{l} x$ where $C$ is an arbitrary
constant. In this case there are no forces that can keep the bar in
the rectilinear equilibrium state and therefore even arbitrarily 
small external actions can result in finite deviations from this 
state. This means that the bar becomes unstable. The expression

$$P_{cr} = \frac{\pi^2 EJ}{l^2}$$

of the critical force was found by Euler in 1757. One may think that
after a further increase of the force, when we have $P > P_{cr}$, the
bar will again become rectilinear but this conclusion is incorrect.
The fact is that equation (101) is applicable only to the limiting
case of small deviations from the rectilinear state. A more comprehensive
investigation of the exact non-linear equation describing
the state of the bar for any deviations shows that after $P$ exceeds
$P_{cr}$ there appears a new curvilinear equilibrium state which is
stable and which exists together with the unstable rectilinear equilib- 
rium state. The curvature of the curvilinear equilibrium form rapidly 
increases, as P increases, and finally the bar breaks. 
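As a small illustration of Euler's formula, the critical force can be evaluated directly. The numerical values of $E$, $J$ and $l$ are assumptions chosen only to have something to compute (roughly a thin steel bar), and the function names are hypothetical.

```python
import math

def euler_critical_force(E, J, l):
    """Critical force P_cr = pi^2 * E * J / l^2 for a bar hinged at both ends."""
    return math.pi ** 2 * E * J / l ** 2

def below_first_critical_value(P, E, J, l):
    """True when P/(E*J) < (pi/l)^2, i.e. only the zero solution exists."""
    return P / (E * J) < (math.pi / l) ** 2

# Assumed sample data (SI units): not taken from the text.
E = 2.0e11   # modulus of elasticity, Pa
J = 1.0e-8   # moment of inertia of the cross section, m^4
l = 2.0      # length of the bar, m

P_cr = euler_critical_force(E, J, l)
stable = below_first_critical_value(0.5 * P_cr, E, J, l)
unstable = not below_first_critical_value(1.5 * P_cr, E, J, l)
```

Note that doubling the length of the bar reduces the critical force by a factor of four, which is read off directly from the $l^2$ in the denominator.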

The notion of an influence function (Green’s function) introduced 
in Sec. XIV.26 can be applied for solving the non-homogeneous 
equation with the homogeneous boundary conditions 


$$y'' + p(x)\,y' + q(x)\,y = f(x) \quad (a \le x \le b), \quad y(a) = 0, \quad y(b) = 0 \qquad (103)$$


in the basic (non-singular) case. Indeed, we can interpret the function
$f(x)$ as an “external” action. Then $y(x)$ is interpreted as the result
of the action, i.e. $f(x) \mapsto y(x)$. The superposition principle is also
applicable here (why is it so?).

According to Sec. XIV.26, let us denote the solution of problem
(103) in which the delta function $\delta(x - \xi)$ substitutes for $f(x)$
by $G(x, \xi)$; then the solution of problem (103) for an arbitrary
function $f(x)$ is expressed in the form

$$y(x) = \int_a^b G(x, \xi)\,f(\xi)\,d\xi \qquad (104)$$

We now take a simple example. Consider the problem

$$y'' = f(x) \quad (0 \le x \le l), \quad y(0) = y(l) = 0 \qquad (105)$$

If we substitute $\delta(x - \xi)$ for $f(x)$ we simply obtain $y'' = 0$ for
$0 \le x < \xi$ and for $\xi < x \le l$. Thus the solution is

$$y = ax + b \ \text{ for } \ 0 \le x < \xi \quad \text{and} \quad y = cx + d \ \text{ for } \ \xi < x \le l$$

where $a$, $b$, $c$ and $d$ are some constants. The boundary conditions
imply that $b = 0$ and $cl + d = 0$, i.e.

$$y = ax \ \text{ for } \ 0 \le x < \xi \quad \text{and} \quad y = c(x - l) \ \text{ for } \ \xi < x \le l \qquad (106)$$

If we integrate the equality $y'' = \delta(x - \xi)$ from $x = \xi - 0$ to
$x = \xi + 0$ we get $y'(\xi + 0) - y'(\xi - 0) = 1$. By the
way, the result would be the same for the left-hand side of
equation (103) because integrating a finite function over an
interval of zero length results in zero. The repeated integration of
the delta function yields a continuous function (see Sec. XIV.25)
and therefore we have $y(\xi - 0) = y(\xi + 0)$. Hence we
obtain $c - a = 1$ and $a\xi = c(\xi - l)$ from (106). It follows that

$$a = -\frac{l - \xi}{l}, \quad c = \frac{\xi}{l}$$

Substituting the values of $a$ and $c$ into (106) we find the influence
function of problem (105):

$$G(x, \xi) = \begin{cases} -\dfrac{(l - \xi)\,x}{l} & \text{for } 0 \le x \le \xi \\[2mm] -\dfrac{\xi\,(l - x)}{l} & \text{for } \xi \le x \le l \end{cases}$$

The function is shown in Fig. 296. On the basis of formula (104)
we obtain the formula of the solution of problem (105) valid for
any function $f(x)$:

$$y(x) = \int_0^l G(x, \xi)\,f(\xi)\,d\xi = \int_0^x G(x, \xi)\,f(\xi)\,d\xi + \int_x^l G(x, \xi)\,f(\xi)\,d\xi = -\frac{l - x}{l} \int_0^x \xi f(\xi)\,d\xi - \frac{x}{l} \int_x^l (l - \xi)\,f(\xi)\,d\xi$$

[Fig. 296. Graph of the influence function $G(x, \xi)$.]
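The influence-function formula for the problem $y'' = f(x)$, $y(0) = y(l) = 0$ can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the midpoint rule for the integral, the names `green` and `solve` are hypothetical, and the test right-hand side $f \equiv 1$ is chosen because its exact solution $y = x(x - l)/2$ is easy to verify by hand.

```python
def green(x, xi, l):
    """Influence function of y'' = f, y(0) = y(l) = 0."""
    if x <= xi:
        return -(l - xi) * x / l
    return -xi * (l - x) / l

def solve(f, x, l, n=2000):
    """Approximate y(x) = integral over [0, l] of G(x, xi) f(xi) d(xi)
    by the midpoint rule with n equal subintervals."""
    h = l / n
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        total += green(x, xi, l) * f(xi)
    return total * h

l = 1.0
one = lambda xi: 1.0
y_mid = solve(one, 0.5, l)      # exact value: 0.5 * (0.5 - 1) / 2 = -0.125
y_left = solve(one, 0.0, l)     # boundary condition y(0) = 0
y_right = solve(one, l, l)      # boundary condition y(l) = 0
```

The computed values reproduce the exact solution at the midpoint and both boundary conditions, which is a useful sanity check on the signs in $G(x, \xi)$.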
§ 5. Linear Equations with Constant Coefficients


Linear equations with constant coefficients form an important 
class of differential equations whose integration can be completed 
in a comparatively simple way. 

17. Homogeneous Equations. For definiteness, we now consider
the third-order equation

$$z''' + a_1 z'' + a_2 z' + a_3 z = 0 \qquad (107)$$

where $a_1$, $a_2$ and $a_3$ are constants. Euler’s idea is to seek a particular
solution of the equation in the form

$$z = e^{px} \qquad (108)$$

where $p$ is a constant that must be appropriately chosen.
Substituting (108) into (107) we see that

$$e^{px}\,(p^3 + a_1 p^2 + a_2 p + a_3) = 0$$

The first factor being unequal to zero, we obtain

$$p^3 + a_1 p^2 + a_2 p + a_3 = 0 \qquad (109)$$


Thus, a function of form (108) satisfies equation (107) if and only
if $p$ satisfies equation (109). Algebraic equation (109) in the unknown
quantity $p$ is called the characteristic equation of equation
(107). The left-hand side of the characteristic equation (109) is 
called the characteristic polynomial of equation (107). The degree 
of a characteristic equation is equal to the order of the corresponding 
equation. 
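The equivalence between "the exponential satisfies the differential equation" and "the exponent is a root of the characteristic polynomial" can be seen on a concrete case. The coefficients below are an assumed example, chosen so that the characteristic polynomial is $(p - 1)(p - 2)(p - 3)$; the function names are hypothetical.

```python
import math

def char_poly(p, a1, a2, a3):
    """Characteristic polynomial p^3 + a1 p^2 + a2 p + a3."""
    return p ** 3 + a1 * p ** 2 + a2 * p + a3

def residual(p, x, a1, a2, a3):
    """Left-hand side z''' + a1 z'' + a2 z' + a3 z evaluated on z = exp(p x).

    The derivatives of exp(p x) are p*z, p^2*z and p^3*z.
    """
    z = math.exp(p * x)
    return p ** 3 * z + a1 * p ** 2 * z + a2 * p * z + a3 * z

# Assumed example: (p - 1)(p - 2)(p - 3) = p^3 - 6 p^2 + 11 p - 6
a1, a2, a3 = -6.0, 11.0, -6.0
roots = [1.0, 2.0, 3.0]

on_roots = [residual(p, 0.7, a1, a2, a3) for p in roots]
off_root = residual(1.5, 0.7, a1, a2, a3)
```

For the three roots the residual vanishes, while for $p = 1.5$, which is not a root, it equals $P(1.5)\,e^{1.5 x} \ne 0$.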

Equation (109) has three roots (see Sec. VIII.8) which we denote
as $p_1$, $p_2$ and $p_3$. There can be different cases here, namely, the following
ones.

1. Let all the roots be real and simple (that is distinct from each
other). Then, by formula (108), we have three particular solutions
of equation (107) of the form

$$z_1 = e^{p_1 x}, \quad z_2 = e^{p_2 x} \quad \text{and} \quad z_3 = e^{p_3 x}$$

The solutions being linearly independent (i.e. none of them being
a linear combination of the rest), the general solution of equation
(107), as it is implied by Sec. 14, has the form

$$z = C_1 e^{p_1 x} + C_2 e^{p_2 x} + C_3 e^{p_3 x} \qquad (110)$$


2. Let all the roots be simple but let there be imaginary roots 
among them. Then there appear complex functions of a real argu- 
ment on the right-hand side of formula (110) (see Sec. VIII.6). But 
it is easy to show that the whole theory of linear equations (see § 4) 
can be automatically extended to the case when all the coefficients 
are complex functions or numbers and all the solutions are complex
functions of a real argument. Formula (110) can therefore be applied
to the case of imaginary simple roots of equation (109). It is apparent 
that the arbitrary constants are also complex here in the general case. 

But when we consider real functions it is also preferable to obtain 
the answer in the real form. To attain this we can take advantage 
of the following assertion: if a homogeneous linear equation with 
real coefficients has a complex particular solution the real part 
and the imaginary part of the solution are solutions of the same 
equation. Actually, if $L[y_1 + iy_2] = 0$ (see Sec. 14 on the notation)
then $L[y_1] + iL[y_2] = 0$ from which we deduce $L[y_1] = 0$ and
$L[y_2] = 0$ (why?).

Hence, if the coefficients of equation (107) are real and if it has
a particular solution

$$e^{(r + is)x} = e^{rx} \cos sx + i e^{rx} \sin sx$$

[see formula (VIII.12)] then the functions

$$e^{rx} \cos sx \quad \text{and} \quad e^{rx} \sin sx \qquad (111)$$

are also solutions of equation (107). Since complex roots of an alge- 
braic equation with real coefficients form conjugate pairs of complex 
numbers (see Sec. VIII.8), in our case 2 there are two roots of the form 


$$p_1 = r + is \quad \text{and} \quad p_2 = r - is$$

whereas the third root $p_3$ is real. Hence, the general solution can
be written in the form

$$z = C_1 e^{rx} \cos sx + C_2 e^{rx} \sin sx + C_3 e^{p_3 x} \qquad (112)$$

instead of (110).

For instance, taking equation (88) which describes free oscillations
we obtain the characteristic equation $p^2 + \omega_0^2 = 0$ with the
roots $p_{1,2} = \pm i\omega_0 = 0 \pm i\omega_0$. Thus, the general solution of the
equation is analogous to formula (112):

$$z = C_1 e^{0 \cdot t} \cos \omega_0 t + C_2 e^{0 \cdot t} \sin \omega_0 t$$


Thus, we have obtained the solution of the equation in form (89). 
Formula (112) is sometimes rewritten in another form by taking 
advantage of the formula 


$$C_1 \cos sx + C_2 \sin sx = M \sin(sx + \alpha)$$

To obtain this expression we must put

$$C_1 = M \sin\alpha, \quad C_2 = M \cos\alpha, \quad M = \sqrt{C_1^2 + C_2^2} \quad \text{and} \quad \tan\alpha = \frac{C_1}{C_2}$$

(compare with Sec. I.29). Then, instead of (112), we get

$$z = M e^{rx} \sin(sx + \alpha) + C_3 e^{p_3 x} \qquad (113)$$

where the role of arbitrary constants is played by $M$, $\alpha$ and $C_3$.

3. Let there be multiple roots among the roots of characteristic
equation (109). For example, let $p_2 = p_1$ and $p_3 \ne p_1$. Then, of
course, formula (110) does not yield the general solution since formula
(108) gives only two distinct solutions in this case (i.e. $e^{p_1 x}$
and $e^{p_3 x}$).

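To find a third solution we begin with the case when $p_2 = p_1 + \Delta p$
where $|\Delta p| \ne 0$ is small. Then equation (107) has the solution

$$e^{p_2 x} = e^{p_1 x} e^{\Delta p \cdot x} = e^{p_1 x} \left(1 + \frac{\Delta p \cdot x}{1!} + \frac{(\Delta p)^2 x^2}{2!} + \ldots\right)$$

together with the solution $e^{p_1 x}$ [here we have used expansion
(IV.55)]. Therefore the linear combinations

$$e^{p_2 x} - e^{p_1 x} = e^{p_1 x} \left(\frac{\Delta p \cdot x}{1!} + \frac{(\Delta p)^2 x^2}{2!} + \ldots\right)$$

and

$$\frac{e^{p_2 x} - e^{p_1 x}}{\Delta p} = e^{p_1 x} \left(x + \Delta p\,\frac{x^2}{2!} + \ldots\right) \qquad (114)$$

are also solutions of the equation.

After dividing by $\Delta p$ we can pass to the limit, as $\Delta p \to 0$. Then
all the terms containing $\Delta p$ tend to zero, and therefore we obtain
the solution $x e^{p_1 x}$ for $\Delta p = 0$, that is for $p_2 = p_1$. Hence, in this
case equation (107) has the general solution of the form

$$z = C_1 e^{p_1 x} + C_2 x e^{p_1 x} + C_3 e^{p_3 x}$$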


Similarly, in the case $p_1 = p_2 = p_3$ equation (107) has the solution
$e^{p_1 x}$ and, besides, the solutions $x e^{p_1 x}$ and $x^2 e^{p_1 x}$. To prove this
we can take the second divided difference instead of (114) and pass
to the limit (see Sec. V.7). Therefore the general solution for this
case has the form

$$z = C_1 e^{p_1 x} + C_2 x e^{p_1 x} + C_3 x^2 e^{p_1 x}$$

Equations of an arbitrary order are investigated in a similar way.
If a root $p$ of a characteristic equation has a multiplicity $k$ then
the functions

$$e^{px}, \ x e^{px}, \ x^2 e^{px}, \ \ldots, \ x^{k-1} e^{px}$$

are particular solutions of the corresponding differential equation.
If a pair of complex conjugate roots $r \pm is$ is of multiplicity $k$ then
the functions

$$e^{rx} \cos sx, \ e^{rx} \sin sx, \ x e^{rx} \cos sx, \ x e^{rx} \sin sx, \ \ldots, \ x^{k-1} e^{rx} \cos sx, \ x^{k-1} e^{rx} \sin sx$$

are particular solutions.
Thus, the only practical difficulty in integrating a homogeneous 
linear equation with constant coefficients lies in solving the corresponding
characteristic equation which can be performed by means
of the methods described in Secs. V.2-5 and VIII.9.
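The limiting process used for a double root can be observed numerically: the divided difference of two nearby exponential solutions approaches $x e^{p_1 x}$ as the roots merge. The values of $p_1$ and $x$ below are arbitrary assumptions, and the function names are hypothetical.

```python
import math

def divided_difference(p1, dp, x):
    """(exp((p1 + dp) x) - exp(p1 x)) / dp, a solution while the roots differ."""
    return (math.exp((p1 + dp) * x) - math.exp(p1 * x)) / dp

def limit_solution(p1, x):
    """The limiting solution x * exp(p1 x) for a double root p1."""
    return x * math.exp(p1 * x)

p1, x = -0.5, 1.2
errors = [
    abs(divided_difference(p1, dp, x) - limit_solution(p1, x))
    for dp in (1e-1, 1e-2, 1e-3)
]
```

The error shrinks roughly in proportion to the root separation, in agreement with the first neglected term of the expansion, which is proportional to the separation times $x^2 / 2!$.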

As an example, we now consider free oscillations of a material
point when there is a linear law of elasticity and an additional force
of viscous friction which is proportional to the first degree of the
velocity. In this case we must add the term $-f\,\frac{dy}{dt}$ to the right-hand
side of equation (3) where $f$ is the coefficient of viscous friction.

[Fig. 297. Graph of the damped oscillations $z(t)$.]

After transposing all the terms to the left-hand side and dividing
by $M$ we obtain the equation

$$z'' + 2hz' + \omega_0^2 z = 0 \qquad (115)$$

similar to (88) where $2h = \frac{f}{M}$ and $\omega_0$ has the same meaning as in (88).
In solving the characteristic equation

$$p^2 + 2hp + \omega_0^2 = 0 \qquad (116)$$


we can encounter two basic cases. If $h < \omega_0$, that is if the friction
is comparatively small, equation (116) has the solution

$$p_{1,2} = -h \pm \sqrt{h^2 - \omega_0^2} = -h \pm i\sqrt{\omega_0^2 - h^2}$$

and the general solution of equation (115) is therefore analogous
to (113):

$$z(t) = M e^{-ht} \sin(\omega t + \alpha)$$

where $\omega = \sqrt{\omega_0^2 - h^2}$.

We see that the presence of small friction yields damped oscillations
whose amplitude decreases according to an exponential law
(because of the factor $e^{-ht}$). Besides, the friction decreases the frequency
(since $\omega < \omega_0$). The graph of the solution is depicted in
Fig. 297. Zeros of the solution are defined by the factor $\sin(\omega t + \alpha)$
and therefore they are equally spaced. After the time period $T = \frac{2\pi}{\omega}$
elapses the sine repeats its value and the function $e^{-ht}$ “acquires” the
factor $e^{-h \frac{2\pi}{\omega}}$ which causes damping. The time interval $T$ is often
called the “period” of the oscillations although the function $z(t)$
is a non-periodic function here because during each subsequent
“period” the amplitude of $z$ decreases $e^{\frac{2\pi h}{\omega}}$ times (whereas the general
form of the function does not change).

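If $h > \omega_0$, that is if the friction is comparatively large, equation
(116) has real roots, and equation (115) has the general solution

$$z(t) = C_1 e^{p_1 t} + C_2 e^{p_2 t} = C_1 e^{-\left(h - \sqrt{h^2 - \omega_0^2}\right) t} + C_2 e^{-\left(h + \sqrt{h^2 - \omega_0^2}\right) t}$$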


Only the first summand is essential here for large t (why?). Hence, 
we obtain an exponential damping without oscillations in this case. 
This is the so-called aperiodic damping. 

Theoretically, there is a possibility of a “bordering” case when
$h = \omega_0$. Then equation (116) has a double root. We leave it to the
reader to verify that in this case we also have aperiodic damping. 
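The three regimes of the friction problem can be distinguished mechanically from $h$ and $\omega_0$. The sketch below is illustrative only: the function names are hypothetical, and the initial data $z(0) = z_0$, $z'(0) = v_0$ used to fix $M$ and $\alpha$ in the oscillatory case are an assumption added here, not part of the text.

```python
import cmath
import math

def characteristic_roots(h, omega0):
    """Roots of p^2 + 2 h p + omega0^2 = 0 (complex in general)."""
    disc = cmath.sqrt(h * h - omega0 * omega0)
    return -h + disc, -h - disc

def regime(h, omega0):
    if h < omega0:
        return "damped oscillations"
    if h > omega0:
        return "aperiodic damping"
    return "bordering case (double root)"

def underdamped(h, omega0, z0, v0):
    """M, alpha, omega in z = M exp(-h t) sin(omega t + alpha)
    for assumed initial data z(0) = z0, z'(0) = v0 (requires h < omega0)."""
    omega = math.sqrt(omega0 ** 2 - h ** 2)
    c1 = z0                      # coefficient of exp(-h t) cos(omega t)
    c2 = (v0 + h * z0) / omega   # coefficient of exp(-h t) sin(omega t)
    M = math.hypot(c1, c2)
    alpha = math.atan2(c1, c2)   # since C1 = M sin(alpha), C2 = M cos(alpha)
    return M, alpha, omega

roots = characteristic_roots(3.0, 5.0)
M, alpha, omega = underdamped(3.0, 5.0, 1.0, 1.0)
```

With $h = 3$ and $\omega_0 = 5$ the roots are $-3 \pm 4i$, so the oscillation frequency $\omega = 4$ is lower than $\omega_0$, as stated in the text.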

18. Non-Homogeneous Equations with Right-Hand Sides of Spe- 
cial Form. We now turn to non-homogeneous linear equations with 
constant coefficients. For instance, let us take a third-order equation
of the form

$$y''' + a_1 y'' + a_2 y' + a_3 y = f(x) \quad (a_1, a_2, a_3 = \text{const}) \qquad (117)$$

The corresponding homogeneous equation being always solvable 
(see Sec. 17), the considerations of Sec. 15 imply that it is only a 
single particular solution of equation (117) that we have to find. For 
the right-hand side of the general form this can be achieved by the 
method of variation of arbitrary constants (see Sec. 15). But there 
exists an important and a rather wide class of right-hand sides of 
a special form for which a particular solution can be found consi- 
derably faster and simpler by applying the method of undetermined 
coefficients. 

We first take the equation

$$y''' + a_1 y'' + a_2 y' + a_3 y = K e^{\lambda x} \quad (K, \lambda = \text{const}) \qquad (118)$$

It is natural to seek a particular solution of the equation in the form

$$y = A e^{\lambda x} \qquad (119)$$

where the constant $A$ is yet unknown. Substituting (119) into (118)
we obtain

$$A\lambda^3 e^{\lambda x} + A a_1 \lambda^2 e^{\lambda x} + A a_2 \lambda e^{\lambda x} + A a_3 e^{\lambda x} = K e^{\lambda x}$$

This results in

$$A = \frac{K}{\lambda^3 + a_1 \lambda^2 + a_2 \lambda + a_3} = \frac{K}{P(\lambda)} \quad \text{and} \quad y = \frac{K}{P(\lambda)}\,e^{\lambda x} \qquad (120)$$

(after cancelling out the factor $e^{\lambda x}$) where $P(\lambda)$ designates the characteristic
polynomial.

The result is valid if $P(\lambda) \ne 0$, that is if $\lambda$ is not a root of the
characteristic equation. If $P(\lambda) = 0$ then function (119) satisfies
homogeneous equation (107) and therefore it cannot satisfy equation (118).

Let $P(\lambda) = 0$ and $P'(\lambda) \ne 0$ which means that $\lambda$ is a simple
root (see Sec. VIII.8) of the characteristic equation. Then, following
the same way of reasoning as in Sec. 17 when we considered the case
of multiple roots, we substitute $\lambda_1 = \lambda + \alpha$ for $\lambda$ into the right-hand
side of equation (118) where $|\alpha|$ is small but different from
zero. The value $\lambda_1$ is no longer a root of the characteristic equation
and therefore formula (120) suggests that equation (118) has a particular
solution of the form

$$\frac{K}{P(\lambda_1)}\,e^{\lambda_1 x} = \frac{K}{P(\lambda + \alpha)}\,e^{\lambda x} e^{\alpha x} = \frac{K}{P(\lambda + \alpha)}\,e^{\lambda x} \left(1 + \frac{\alpha x}{1!} + \frac{\alpha^2 x^2}{2!} + \ldots\right) =$$
$$= \frac{K}{P(\lambda + \alpha)}\,e^{\lambda x} + \frac{K\alpha}{P(\lambda + \alpha)}\,e^{\lambda x} \left(\frac{x}{1!} + \frac{\alpha x^2}{2!} + \ldots\right)$$

The first summand on the right-hand side of the last relation
satisfying the corresponding homogeneous equation, the second
summand is also a particular solution of equation (118) in which
$\lambda$ is replaced by $\lambda_1$ (why is it so?). We now pass to the limit in the second
summand as $\alpha \to 0$. Applying L’Hospital’s rule to calculating the limit
$\lim_{\alpha \to 0} \frac{\alpha}{P(\lambda + \alpha)}$ we obtain the expression

$$y = \frac{K}{P'(\lambda)}\,x e^{\lambda x} \qquad (121)$$

(check it up!) which is a particular solution of equation (118) for the
original value of $\lambda$.

In a similar way we investigate the case when $\lambda$ is a double root
of the characteristic equation and find that in this case there is
a particular solution of equation (118) of the form

$$y = \frac{K}{P''(\lambda)}\,x^2 e^{\lambda x}$$

and so on.
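The case analysis for an exponential right-hand side $K e^{\lambda x}$ can be collected into one routine. This is a sketch under stated assumptions: the function name and the return convention (coefficient, power of $x$) are introduced here for illustration, and the triple-root branch extends the pattern "and so on" by using the third derivative $P''' = 6$, which is not written out in the text.

```python
def particular_solution(K, lam, a1, a2, a3, tol=1e-12):
    """Particular solution of y''' + a1 y'' + a2 y' + a3 y = K exp(lam x).

    Returns (coefficient, power), meaning
    y = coefficient * x**power * exp(lam * x).
    """
    p = lam ** 3 + a1 * lam ** 2 + a2 * lam + a3   # P(lam)
    if abs(p) > tol:
        return K / p, 0                            # lam is not a root
    dp = 3 * lam ** 2 + 2 * a1 * lam + a2          # P'(lam)
    if abs(dp) > tol:
        return K / dp, 1                           # simple root
    ddp = 6 * lam + 2 * a1                         # P''(lam)
    if abs(ddp) > tol:
        return K / ddp, 2                          # double root
    return K / 6.0, 3                              # triple root (extrapolated pattern)

# Assumed examples: P(p) = (p-1)(p-2)(p-3) and P(p) = (p-1)^2 (p-2)
not_root = particular_solution(2.0, 0.0, -6.0, 11.0, -6.0)
simple_root = particular_solution(2.0, 1.0, -6.0, 11.0, -6.0)
double_root = particular_solution(2.0, 1.0, -4.0, 5.0, -2.0)
```

For instance, in the simple-root example the routine returns $y = x e^{x}$, and indeed applying $(D - 1)(D - 2)(D - 3)$ to $x e^{x}$ gives $2 e^{x}$, matching $K = 2$.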

By means of a similar but a little more complicated procedure we
can prove that the equation

$$y''' + a_1 y'' + a_2 y' + a_3 y = Q_m(x)\,e^{\lambda x} \qquad (122)$$

where $Q_m(x)$ is a given polynomial of degree $m$ possesses a particular
solution of the form

$$y = R_m(x)\,e^{\lambda x} \qquad (123)$$

provided $\lambda$ is not a root of the characteristic equation [$R_m(x)$ denotes
some other polynomial of degree $m$]. The solution can also be found
by means of the method of undetermined coefficients. To apply the
method we write expression (123) with literal coefficients and substitute
it into (122). Then the usual procedure of equating coefficients
in similar terms, based on the identical equality of both
sides of the relation, yields the numerical values of the coefficients
entering into (123). Similarly, if $\lambda$ is a root of the characteristic
equation of multiplicity $k$ then there is a particular solution of the
form

$$y = x^k R_m(x)\,e^{\lambda x}$$

The case $\lambda = 0$ is not excluded here and therefore we can also
obtain a particular solution when there is a polynomial of degree
$m$ on the right-hand side of equation (122) (without an exponential
factor).

We can also consider the equation

$$y''' + a_1 y'' + a_2 y' + a_3 y = Q_m(x)\,e^{\mu x} \cos \nu x$$

Since we can rewrite the right-hand side in the form

$$Q_m(x)\,e^{\mu x}\,\frac{e^{i\nu x} + e^{-i\nu x}}{2} = \frac{Q_m(x)}{2}\,e^{(\mu + i\nu) x} + \frac{Q_m(x)}{2}\,e^{(\mu - i\nu) x}$$

(see Sec. VIII.4), formula (123) suggests that if $\lambda = \mu \pm i\nu$ is not
a root of the characteristic equation, a particular solution can be
sought in the form

$$y = R_m(x)\,e^{(\mu + i\nu) x} + \tilde{R}_m(x)\,e^{(\mu - i\nu) x} =$$
$$= R_m(x)\,e^{\mu x} (\cos \nu x + i \sin \nu x) + \tilde{R}_m(x)\,e^{\mu x} (\cos \nu x - i \sin \nu x) =$$
$$= [R_m(x) + \tilde{R}_m(x)]\,e^{\mu x} \cos \nu x + [iR_m(x) - i\tilde{R}_m(x)]\,e^{\mu x} \sin \nu x$$

Changing the notation we arrive at a particular solution of the form

$$y = T_m(x)\,e^{\mu x} \cos \nu x + S_m(x)\,e^{\mu x} \sin \nu x \qquad (124)$$

where $T_m(x)$ and $S_m(x)$ are polynomials of degree $m$ which can
be found with the help of the method of undetermined coefficients.
In the case the right-hand side is of the form

$$Q_m(x)\,e^{\mu x} \sin \nu x \quad \text{or} \quad Q_m(x)\,e^{\mu x} \cos(\nu x + \alpha) \quad \text{or} \quad Q_m(x)\,e^{\mu x} \sin(\nu x + \alpha) \quad (\alpha = \text{const})$$

we can also seek a particular solution in form (124).
If $\lambda = \mu + i\nu$ is a root of the characteristic equation of multiplicity
$k$ the right-hand side of formula (124) should be additionally
multiplied by $x^k$.
For instance, let us consider equation (87) with a sinusoidal right-hand
member $K \sin \omega t$:

$$y'' + \omega_0^2 y = K \sin \omega t \qquad (125)$$

The equation describes forced oscillations under the action of an
external sinusoidal force of frequency $\omega$.
According to formula (124), if $\lambda = \pm i\omega$ does not coincide with
any root of the characteristic equation, i.e. if $\omega \ne \omega_0$, a particular
solution can be found in the form

$$y = A \cos \omega t + B \sin \omega t$$

Substituting this expression into equation (125) we obtain

$$-A\omega^2 \cos \omega t - B\omega^2 \sin \omega t + A\omega_0^2 \cos \omega t + B\omega_0^2 \sin \omega t = K \sin \omega t$$

The last equality being an identity, we must have

$$-A\omega^2 + A\omega_0^2 = 0 \quad \text{and} \quad -B\omega^2 + B\omega_0^2 = K$$

Hence, $A = 0$ and $B = \frac{K}{\omega_0^2 - \omega^2}$, and thus we obtain

$$y = \frac{K}{\omega_0^2 - \omega^2} \sin \omega t \qquad (126)$$


To obtain the general solution of equation (125) we must add 
the general solution of equation (88) to expression (126). Consequent- 
ly, if the frequency of the external force is different from the funda- 
mental frequency of oscillations we have a superposition of two 
harmonic oscillations of forms (89) and (126). Function (126) 
describes the so-called forced oscillations whose frequency coincides 
with the frequency of the external action and whose amplitude and 
phase have completely specified values. Function (89), which is 
a solution of equation (88), describes free oscillations whose amplitude
and phase depend on the initial data.

Formula (126) shows that when $\omega \ne \omega_0$ and is close to $\omega_0$ the
amplitude of forced oscillations becomes very large. But if $\omega = \omega_0$
then, according to the general theory, a particular solution of equation
(125) can be found in the form $y = At \cos \omega_0 t + Bt \sin \omega_0 t$.

The substitution of the last expression into the equation results
in the formula

$$y = -\frac{K}{2\omega_0}\,t \cos \omega_0 t$$

which can also be deduced from (126) by analogy with formula (121).
(Let the reader verify the validity of the above expression.)

We see that if the frequency of a harmonic external action is
equal to the fundamental frequency of oscillations the amplitude 
of forced oscillations increases according to a linear law. This impor- 
tant phenomenon is well known in physics and engineering and is 
called resonance. 
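Both the non-resonant and the resonant particular solutions can be checked by substituting them into the equation numerically. The sketch below approximates the second derivative by central differences (an approximation, not an exact check); the function names and sample values $K = 1$, $\omega_0 = 2$ are assumptions for illustration.

```python
import math

def forced_amplitude(K, omega0, omega):
    """Amplitude K / (omega0^2 - omega^2) of forced oscillations, omega != omega0."""
    return K / (omega0 ** 2 - omega ** 2)

def resonant_solution(K, omega0, t):
    """Particular solution -K t cos(omega0 t) / (2 omega0) at resonance."""
    return -K * t * math.cos(omega0 * t) / (2 * omega0)

def residual(y, omega0, K, omega, t, dt=1e-4):
    """y'' + omega0^2 y - K sin(omega t), with y'' from central differences."""
    ypp = (y(t + dt) - 2 * y(t) + y(t - dt)) / dt ** 2
    return ypp + omega0 ** 2 * y(t) - K * math.sin(omega * t)

K, omega0 = 1.0, 2.0
B = forced_amplitude(K, omega0, 1.0)
off_resonance = residual(lambda t: B * math.sin(1.0 * t), omega0, K, 1.0, 1.3)
at_resonance = residual(lambda t: resonant_solution(K, omega0, t), omega0, K, omega0, 1.3)
near = [abs(forced_amplitude(K, omega0, w)) for w in (1.0, 1.9, 1.99)]
```

The list `near` shows the amplitude growing without bound as the driving frequency approaches $\omega_0$, while at $\omega = \omega_0$ itself the solution grows linearly in $t$.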

19. Euler’s Equations. Euler’s equations are of the form

$$(ax + b)^n y^{(n)} + a_1 (ax + b)^{n-1} y^{(n-1)} + \ldots + a_{n-1} (ax + b)\,y' + a_n y = f(x)$$

where $a_1, a_2, \ldots, a_n$ are constant coefficients.
Euler’s equation can be easily reduced to a linear equation with
constant coefficients by means of the change of the independent
variable

$$|ax + b| = e^t, \quad \text{i.e.} \quad t = \ln |ax + b|$$

For the sake of simplicity, let us suppose that $ax + b > 0$ and
take a homogeneous second-order equation:

$$(ax + b)^2 y'' + a_1 (ax + b)\,y' + a_2 y = 0 \qquad (127)$$

After the independent variable is changed,

$$ax + b = e^t, \quad t = \ln(ax + b), \qquad (128)$$

we obtain

$$y' = \frac{dy}{dx} = \frac{dy}{dt} \frac{dt}{dx} = a e^{-t} \frac{dy}{dt}$$

and

$$y'' = \frac{dy'}{dx} = \frac{dy'}{dt} \frac{dt}{dx} = a \left(e^{-t} \frac{d^2 y}{dt^2} - e^{-t} \frac{dy}{dt}\right) a e^{-t} = a^2 e^{-2t} \left(\frac{d^2 y}{dt^2} - \frac{dy}{dt}\right)$$

Substituting the results into equation (127) we obtain

$$a^2 \left(\frac{d^2 y}{dt^2} - \frac{dy}{dt}\right) + a_1 a\,\frac{dy}{dt} + a_2 y = 0$$

This is an equation with constant coefficients which can be solved
by means of the methods described in Sec. 17, i.e. we must put

$$y = e^{pt} \qquad (129)$$


Then we solve the corresponding characteristic equation etc. and,
finally, turn back from $t$ to $x$.
It is possible to avoid substitution (128) because (128) and (129)
imply that

$$y = (ax + b)^p \qquad (130)$$

We can therefore directly substitute (130) into (127), i.e. we can
seek a solution in form (130). This yields a characteristic equation
for determining $p$, the degree of the equation being equal to the
order of equation (127). But one must take into account that, in
case there are multiple roots of the characteristic equation, equation
(127) possesses solutions of the form

$$y = t e^{pt} = (ax + b)^p \ln(ax + b)$$

besides solutions of form (130) (the form of these additional
solutions depends on the multiplicity of the root; see case 3 in
Sec. 17).
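For the second-order Euler equation, substituting $y = (ax + b)^p$ leads to the quadratic $a^2 p(p - 1) + a_1 a p + a_2 = 0$ for the exponent. A minimal sketch of this computation follows; the function names and the sample coefficients are assumptions made for illustration.

```python
import cmath

def euler_exponents(a, a1, a2):
    """Roots p of a^2 p (p - 1) + a1 a p + a2 = 0."""
    A = a * a
    B = a1 * a - a * a
    C = a2
    disc = cmath.sqrt(B * B - 4 * A * C)
    return (-B + disc) / (2 * A), (-B - disc) / (2 * A)

def residual(p, x, a, b, a1, a2):
    """(ax+b)^2 y'' + a1 (ax+b) y' + a2 y evaluated on y = (ax+b)^p."""
    u = a * x + b
    y = u ** p
    yp = p * a * u ** (p - 1)
    ypp = p * (p - 1) * a * a * u ** (p - 2)
    return u * u * ypp + a1 * u * yp + a2 * y

# Assumed example 1: x^2 y'' - 2 y = 0, exponents 2 and -1
p_plus, p_minus = euler_exponents(1.0, 0.0, -2.0)

# Assumed example 2: (2x+1)^2 y'' + 3 (2x+1) y' - 4 y = 0
q_plus, q_minus = euler_exponents(2.0, 3.0, -4.0)
check = [abs(residual(q, 1.5, 2.0, 1.0, 3.0, -4.0)) for q in (q_plus, q_minus)]
```

The roots are returned as complex numbers so that the same routine also covers a negative discriminant, in which case the real solutions involve $\cos$ and $\sin$ of $\ln(ax + b)$.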

20. Operators and the Operator Method of Solving Differential
Equations. The notion of an operator was introduced in Sec. XIV.26.
We also considered some examples of operators including the operator
of differentiation $D$. In Sec. 14 we introduced a differential
operator $L$. Further examples of operators are the shift operator $T$
and the difference operator $\Delta$ which are defined by the formulas

$$Tf(x) = f(x + h) \quad \text{and} \quad \Delta f(x) = f(x + h) - f(x) \qquad (131)$$

where $h$ is a given step. We also introduce the operator of multiplication
by a number $C$ which we denote by the same letter, $C$ (including
the unit operator 1 which does not change a function and
the zero operator 0 which transforms any function into the function
which is identically equal to zero). There is also an operator of
multiplication by a given function and so on.

Operators can be added together and multiplied by numbers according
to the following rule which looks quite natural: if $A$ and
$B$ are operators and $\alpha$ is a number then

$$(A + B)f = Af + Bf \quad \text{and} \quad (\alpha A)f = \alpha(Af)$$

For instance, equality (131) suggests that

$$\Delta = T - 1 \quad \text{and} \quad T = 1 + \Delta$$


All the axioms of linear operations hold for these rules of addition 
of operators and multiplication by a number (see Sec. VII.17). 

We can multiply an operator by another one according to Sec. XI.6: 
if A and B are operators then AB is a new operator defined by the 
formula 


(AB) f = A (Bf) 


which means that to obtain ABf we must first apply the operator 
B to the function f and then apply operator A to the result. We can 
easily verify the following rules: 


$$A(BC) = (AB)C \quad \text{and} \quad (\alpha A + \beta B)C = \alpha AC + \beta BC \qquad (132)$$

where $\alpha, \beta = \text{const}$. But the equality $AB = BA$ may not hold,
that is in the general case the result of performing two operations
may depend on the order they are performed. But nevertheless in
particular cases we can have $AB = BA$ and then we say that the
operators commute with each other. For example, all the above
operators $D$, $T$, $\Delta$ and $C$ commute with each other because we have

$$DTf(x) = D(Tf(x)) = D(f(x + h)) = f'(x + h), \qquad TDf(x) = T(f'(x)) = f'(x + h) \quad \text{etc.}$$


On the other hand, the operator of differentiation of a function and 
the operator of multiplication of a function by a given function 
do not commute (the reader should verify it!). 
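Operators acting on functions are naturally modelled as functions that take a function and return a function. The sketch below is illustrative: the names are hypothetical, and the differentiation operator is replaced by a central-difference approximation, so equalities hold only up to a small numerical error.

```python
def shift(h):
    """Shift operator T: (T f)(x) = f(x + h)."""
    return lambda f: (lambda x: f(x + h))

def difference(h):
    """Difference operator: (Delta f)(x) = f(x + h) - f(x)."""
    return lambda f: (lambda x: f(x + h) - f(x))

def multiply_by(g):
    """Operator of multiplication by a given function g."""
    return lambda f: (lambda x: g(x) * f(x))

def derivative(eps=1e-6):
    """Numerical stand-in for D (central difference), an approximation."""
    return lambda f: (lambda x: (f(x + eps) - f(x - eps)) / (2 * eps))

f = lambda x: x ** 3
T, Delta, D = shift(0.5), difference(0.5), derivative()
X = multiply_by(lambda x: x)

dt, td = D(T(f))(1.0), T(D(f))(1.0)   # D and T commute
dx, xd = D(X(f))(1.0), X(D(f))(1.0)   # D and multiplication by x do not
delta_check = Delta(f)(1.0) - (T(f)(1.0) - f(1.0))
```

For multiplication by $x$ the two orders differ exactly by $f(x)$, since $(x f)' - x f' = f$; at $x = 1$ with $f = x^3$ the difference is 1.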

The first property (132) enables us to write $ABC$ instead of $(AB)C$
or $A(BC)$. Thus, $ABC$ is an operator whose action upon an object
reduces to performing the operations $C$, $B$ and $A$ in succession. An
operator of the form $ABCD$ is defined similarly and so on. If we
take equal factors then we arrive at powers of an operator: $A^2$, $A^3$,
$A^4$ etc. Hence, $A^n$ designates the repeated application of the operator
$A$. For instance, we have $D^2 f = f''$, and $\Delta^2 f$ is the second difference
(see Sec. XI.6) etc.

An operator $A$ is called a linear operator if

$$A(f_1 + f_2) = A(f_1) + A(f_2) \quad \text{and} \quad A(\alpha f) = \alpha A(f) \qquad (133)$$

where $\alpha = \text{const}$ (see Sec. XI.6). It is natural to interpret the first
property as the principle of superposition, and the second property
can be deduced from the first (see Sec. XIV.26). Even in the case
when the explicit expression of an operator is unknown the validity
of the superposition principle indicates the linearity of the operator
which enables us to draw some useful conclusions; for example, we
can speak about its influence function (see Sec. XIV.26) in such
a case.

Both properties (133) can be put down as

$$A(\alpha f_1 + \beta f_2) = \alpha A f_1 + \beta A f_2$$

where $\alpha$ and $\beta$ are constants. It is easy to verify the following property
of a linear operator $A$:

$$A(\alpha B + \beta C) = \alpha AB + \beta AC \qquad (134)$$


where B and C are arbitrary operators. To deduce the property one 
should apply both parts of (134) to an arbitrary function f and show 
that the results will be the same, namely, equal to aA (Bf) + 
+ BA (Cf). 

All the operators we have taken here as examples are linear. 
Indeed, as we know, the derivative of a sum equals the sum of the 
derivatives and so on. An example of a non-linear operator is the 
operator of squaring a function or the operator of forming the ab- 
solute value of a function (verify that the operators are non-linear!) 
and the like. In performing linear operations on linear operators 
and the operation of multiplying operators by each other we can 
use rules of elementary algebra but at the same time we must pay 
attention to the order of factors. For instance,
$(A + B)^2 = (A + B)(A + B) = A^2 + AB + BA + B^2$. This assertion is
suggested by (132) and (134).

In addition, if the operators commute with each other the order 
of factors does not matter either. For example, in such a case we 
have (A + B)? = A? + 2AB + B? and so on. 

We can also consider power series (see Sec. IV.16) of operators.
For instance, we can define the operator $e^A$ as

$$e^A = 1 + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \ldots \qquad (135)$$

and so on.

Generally, such a series or any other operator cannot be applied
to all functions because usually there is a certain class of functions 
for which an operator makes sense, and thus it should be used only 
for these functions. 

As for example (135), we see that the greater the number of the 
terms taken, the more accurate the result. Theoretically, the exact 
result is obtained only in the limiting process. From the practical 
point of view this means that the number of terms must be sufficient- 
ly large for the result to be regarded as exact. 

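Taylor’s series

$$f(x + h) = f(x) + \frac{f'(x)}{1!}h + \frac{f''(x)}{2!}h^2 + \frac{f'''(x)}{3!}h^3 + \dots$$

(see Sec. IV.16) can be written in the form

$$Tf = \left(1 + \frac{hD}{1!} + \frac{h^2D^2}{2!} + \dots\right) f = e^{hD} f$$

This implies a relation between the operators $T$, $\Delta$ and $D$, namely

$$T = e^{hD} \quad\text{and}\quad \Delta = e^{hD} - 1$$

The inverse formula

$$D = \frac{1}{h}\ln T = \frac{1}{h}\ln(1 + \Delta) = \frac{1}{h}\left(\Delta - \frac{\Delta^2}{2} + \frac{\Delta^3}{3} - \dots\right)$$

[which corresponds to formula (IV.61) for the expansion of the natural
logarithm] is nothing but formula (V.32) of numerical differentiation.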
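
For polynomials the exponential series of the differentiation operator terminates, so the relation between the shift and the derivative can be verified exactly. The following sketch assumes that $T$ denotes the shift $f(x) \to f(x + h)$ and $D$ the differentiation; the polynomial, the step $h = 1/2$ and all function names are chosen only for illustration.

```python
from fractions import Fraction
from math import factorial

def deriv(c):
    """Coefficients of the derivative of c[0] + c[1] x + c[2] x^2 + ..."""
    return [k * c[k] for k in range(1, len(c))]

def evaluate(c, x):
    """Value of the polynomial with coefficients c at the point x."""
    return sum(ck * x**k for k, ck in enumerate(c))

def shift_by_series(c, h):
    """Apply 1 + hD/1! + h^2 D^2/2! + ... to the polynomial c.

    The series terminates because high derivatives of a polynomial vanish.
    """
    result = [Fraction(0)] * len(c)
    term, k = list(c), 0
    while term:
        for i, t in enumerate(term):
            result[i] += Fraction(h) ** k * t / factorial(k)
        term, k = deriv(term), k + 1
    return result

f = [Fraction(1), Fraction(-2), Fraction(0), Fraction(3)]   # 1 - 2x + 3x^3
h = Fraction(1, 2)
Tf = shift_by_series(f, h)

# The series applied to f must coincide with the shifted function f(x + h).
print(all(evaluate(Tf, Fraction(x)) == evaluate(f, Fraction(x) + h)
          for x in range(-3, 4)))
```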


We can consider an operator equation of the form 
Ay = f (136) 


where $f$ is a given function and the function $y$ is sought. If
equation (136) has a solution it is natural to denote it as $y = A^{-1}f$.
In case the operator is linear the equation is also said to be a linear
equation. We can immediately extend properties 1-3 in Sec. 14
and property 1 in Sec. 15 to general linear equations. But it should
be taken into account that there are cases when a homogeneous 
equation has infinitely many linearly independent solutions and 
cases when a non-homogeneous equation has no solutions. 

We shall demonstrate the application of the operator of differen- 
tiation to solving linear equations with constant coefficients of 
form (147) (the so-called operator method of solving equations). The 
equation can be rewritten as 


$$(D^3 + a_1D^2 + a_2D + a_3)\,y = f(x)$$

The parentheses contain a linear differential operator of the third
order with constant coefficients. We can factor the operator
into linear factors according to algebraic rules (see Sec. VIII.8):

$$(D - p_1)(D - p_2)(D - p_3)\,y = f(x) \tag{137}$$

where $p_1$, $p_2$ and $p_3$ are the roots of the characteristic equation (see
Sec. 17). Since

$$D(e^{-px}y) = (e^{-px}y)' = -pe^{-px}y + e^{-px}y' = e^{-px}(y' - py) = e^{-px}(D - p)\,y$$

we have

$$(D - p)\,y = e^{px}D(e^{-px}y)$$

and therefore equation (137) can be rewritten in the form

$$e^{p_1x}D\left(e^{-p_1x}e^{p_2x}D\left(e^{-p_2x}e^{p_3x}D\left(e^{-p_3x}y\right)\right)\right) = f(x)$$

Transposing the factors from the left-hand side to the right-hand
side and taking into account that the equality $Dy = z$ is equivalent
to the equality $y = \int z\,dx$ we obtain the general solution of the
original equation:

$$y = e^{p_3x}\int e^{(p_2 - p_3)x}\left(\int e^{(p_1 - p_2)x}\left(\int e^{-p_1x}f(x)\,dx\right)dx\right)dx$$


It is apparent that after integration we obtain the same result 
as in Secs. 15 and 18 although we have used a different approach. 
There are more complicated problems for which the operator method
may be really useful. We must note that such a simple factorization
of an operator into a product of several operators of lower
order can rarely be applied effectively to linear differential operators
with variable coefficients and, all the more so, to non-linear operators.
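
The factor-by-factor inversion can also be carried out numerically. The sketch below inverts each factor $(D - p)$ by the rule $y = e^{px}\int e^{-px}g\,dx$, replacing the integrals by trapezoidal sums taken from $x = 0$. The test equation $y'' - 3y' + 2y = e^{3x}$ (roots 1 and 2), the grid and the function names are assumptions made only for this illustration; with integrals taken from zero the procedure yields one particular solution, here $e^{3x}/2 + e^{x}/2 - e^{2x}$.

```python
import math

def cumulative_integral(values, h):
    """Trapezoidal running integral, starting from the first grid point."""
    out = [0.0]
    for k in range(1, len(values)):
        out.append(out[-1] + 0.5 * h * (values[k - 1] + values[k]))
    return out

def solve_by_factoring(f, roots, x_max=1.0, n=2000):
    """Particular solution of (D - p1)(D - p2)...(D - pk) y = f on [0, x_max]."""
    h = x_max / n
    xs = [k * h for k in range(n + 1)]
    g = [f(x) for x in xs]
    for p in roots:
        # Invert one factor: (D - p) y = g  gives  y = e^{px} * integral of e^{-px} g
        integral = cumulative_integral(
            [math.exp(-p * x) * gv for x, gv in zip(xs, g)], h)
        g = [math.exp(p * x) * iv for x, iv in zip(xs, integral)]
    return xs, g

xs, ys = solve_by_factoring(lambda x: math.exp(3 * x), roots=[1.0, 2.0])
exact = [0.5 * math.exp(3 * x) + 0.5 * math.exp(x) - math.exp(2 * x) for x in xs]
max_error = max(abs(a - b) for a, b in zip(ys, exact))
print(max_error)   # small; limited only by the trapezoidal rule
```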

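We now consider one more simple example. Let it be necessary
to find a solution of the equation

$$y'' + \omega^2y = f(x)$$

where all the quantities are regarded as being real. We write, in
succession,

$$(D^2 + \omega^2)\,y = f(x), \qquad (D - i\omega)(D + i\omega)\,y = f(x),$$

$$e^{i\omega x}D\left(e^{-i\omega x}(D + i\omega)\,y\right) = f(x),$$

$$(D + i\omega)\,y = e^{i\omega x}\int e^{-i\omega x}f(x)\,dx \quad\text{and}\quad \omega y = \operatorname{Im}\left(e^{i\omega x}\int e^{-i\omega x}f(x)\,dx\right)$$

(the sign Im designates the imaginary part according to the notation
introduced in Secs. VIII.1 and VIII.6; since $y$ is real, the imaginary
part of $(D + i\omega)\,y = y' + i\omega y$ is $\omega y$). Finally,

$$y = \frac{1}{\omega}\operatorname{Im}\left(e^{i\omega x}\int e^{-i\omega x}f(x)\,dx\right)$$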


§ 6. Systems of Linear Equations 


21. Systems of Linear Equations. For definiteness, let us consider
a homogeneous linear system of three first-order equations in three
unknown functions $y(x)$, $z(x)$ and $u(x)$. We take the system in the
form solved for the derivatives of the functions:

$$\begin{aligned}
y' &= a_1(x)\,y + b_1(x)\,z + c_1(x)\,u \\
z' &= a_2(x)\,y + b_2(x)\,z + c_2(x)\,u \\
u' &= a_3(x)\,y + b_3(x)\,z + c_3(x)\,u
\end{aligned} \tag{138}$$


We remind the reader that a system of any order can be reduced to 
a first-order system (see Sec. 11) and that the operation of resolving 
a system with respect to the derivatives is performed algebraically 
without solving the differential equations.

We can easily pass from system (138) to an equivalent third-order 
equation (see Sec. 11) which is also a homogeneous linear equation. 
Therefore all the properties of homogeneous linear equations enumerated
in Sec. 14 extend to system (138). The sum of two solutions
$y = y_1$, $z = z_1$, $u = u_1$ and $y = y_2$, $z = z_2$, $u = u_2$ should be
understood as the new solution $y = y_1 + y_2$, $z = z_1 + z_2$,
$u = u_1 + u_2$. Similarly, the product of a solution $y = y_1$, $z = z_1$,
$u = u_1$ by a constant $C$ is the new solution $y = Cy_1$, $z = Cz_1$,
$u = Cu_1$. Hence, linear operations on solutions are performed here
in the same way as on vectors (see Sec. VII.10).

In particular, the general solution of system (138) has the form

$$\begin{aligned}
y &= C_1y_1 + C_2y_2 + C_3y_3 \\
z &= C_1z_1 + C_2z_2 + C_3z_3 \\
u &= C_1u_1 + C_2u_2 + C_3u_3
\end{aligned} \tag{139}$$

where $C_1$, $C_2$ and $C_3$ are arbitrary constants and $(y_1, z_1, u_1)$,
$(y_2, z_2, u_2)$ and $(y_3, z_3, u_3)$ are three linearly independent solutions
of system (138), that is, solutions none of which is a linear combination
of the others.


Let us dwell in more detail on property 4 in Sec. 14. If a nonzero
solution $(y_1, z_1, u_1)$ of system (138) is known then, making the
substitutions $y = y_1\bar{y}$, $z = z_1\bar{z}$, $u = u_1\bar{u}$ and
$\bar{y} = \bar{u} + v$, $\bar{z} = \bar{u} + w$, we easily derive a system of two
equations of the first order in the two unknown functions $v(x)$ and $w(x)$,
after which $\bar{u}$ is found by a single integration.

The verification of the last assertion is left to the reader.
All the properties enumerated in Sec. 15 are also extended to 
non-homogeneous linear systems of the form 


$$\begin{aligned}
y' &= a_1(x)\,y + b_1(x)\,z + c_1(x)\,u + f_1(x) \\
z' &= a_2(x)\,y + b_2(x)\,z + c_2(x)\,u + f_2(x) \\
u' &= a_3(x)\,y + b_3(x)\,z + c_3(x)\,u + f_3(x)
\end{aligned} \tag{140}$$


In particular, the method of variation of arbitrary constants (Sec. 3)
is applied in the following manner. Let general solution (139) of the
corresponding homogeneous system (138) be known. Then a solution
of system (140) is sought in the form

$$\begin{aligned}
y &= \varphi_1(x)\,y_1 + \varphi_2(x)\,y_2 + \varphi_3(x)\,y_3 \\
z &= \varphi_1(x)\,z_1 + \varphi_2(x)\,z_2 + \varphi_3(x)\,z_3 \\
u &= \varphi_1(x)\,u_1 + \varphi_2(x)\,u_2 + \varphi_3(x)\,u_3
\end{aligned}$$

After substituting these expressions into (140) we obtain the system

$$\begin{aligned}
\varphi_1'y_1 + \varphi_2'y_2 + \varphi_3'y_3 &= f_1(x) \\
\varphi_1'z_1 + \varphi_2'z_2 + \varphi_3'z_3 &= f_2(x) \\
\varphi_1'u_1 + \varphi_2'u_2 + \varphi_3'u_3 &= f_3(x)
\end{aligned}$$

(verify the calculations!). The derivatives $\varphi_1'$, $\varphi_2'$ and $\varphi_3'$ are found
algebraically from the last relations. Then, integrating, we obtain
$\varphi_1$, $\varphi_2$ and $\varphi_3$.

Homogeneous linear systems with constant coefficients are especially
important. For example, let us take the system

$$\begin{aligned}
y' &= a_1y + b_1z + c_1u \\
z' &= a_2y + b_2z + c_2u \\
u' &= a_3y + b_3z + c_3u
\end{aligned} \tag{141}$$

in which all the coefficients $a_1, b_1, \dots, c_3$ are constant. The method
of solving such a system is similar to that of Sec. 17. Namely, nonzero
particular solutions are sought in the form

$$y = \lambda e^{px}, \quad z = \mu e^{px}, \quad u = \nu e^{px} \tag{142}$$

where $\lambda$, $\mu$, $\nu$ and $p$ are unknown constants that must be determined.
Substituting (142) into (141), cancelling out $e^{px}$ and transposing
all the terms to the left-hand side we obtain

$$\begin{aligned}
(a_1 - p)\lambda + b_1\mu + c_1\nu &= 0 \\
a_2\lambda + (b_2 - p)\mu + c_2\nu &= 0 \\
a_3\lambda + b_3\mu + (c_3 - p)\nu &= 0
\end{aligned} \tag{143}$$
These equalities should be regarded as a system of three homogeneous
algebraic equations of the first degree in the three unknowns
$\lambda$, $\mu$ and $\nu$. For the system to have a nonzero solution (it is the only
kind of solution that we are interested in, according to the end of Sec. VI.6),
it is necessary and sufficient that the determinant of the system be
equal to zero. Thus,

$$\begin{vmatrix} a_1 - p & b_1 & c_1 \\ a_2 & b_2 - p & c_2 \\ a_3 & b_3 & c_3 - p \end{vmatrix} = 0 \tag{144}$$

This is the characteristic equation of system (141) from which we
find all the possible values of $p$.

Equation (144) being an algebraic equation of the third degree
in $p$ (why is it so?), there are three roots $p_1$, $p_2$ and $p_3$. If all the
roots are simple we can substitute any of them into system (143)
and find a nonzero solution $\lambda$, $\mu$ and $\nu$. Then formula (142) yields
the corresponding solution $y(x)$, $z(x)$, $u(x)$. The three particular
solutions which correspond to $p = p_1$, $p = p_2$ and $p = p_3$ enable
us to write down the general solution of system (141) in accordance
with formula (139):

$$\begin{aligned}
y &= C_1\lambda_1e^{p_1x} + C_2\lambda_2e^{p_2x} + C_3\lambda_3e^{p_3x} \\
z &= C_1\mu_1e^{p_1x} + C_2\mu_2e^{p_2x} + C_3\mu_3e^{p_3x} \\
u &= C_1\nu_1e^{p_1x} + C_2\nu_2e^{p_2x} + C_3\nu_3e^{p_3x}
\end{aligned} \tag{145}$$


where C,, C, and C, are arbitrary constants. 

If equation (144) has imaginary roots we can retain the solution 
in form (145) (which will contain complex functions of x) or, by 
analogy with case 2 in Sec. 17, write the solution in the real form. 

The case when equation (144) has multiple roots is more compli- 
cated. We shall not consider this case in the general form, but in 
concrete problems one can use the following procedure. For example, 
if $p_1$ is a double root, particular solutions corresponding to $p = p_1$
should be looked for in the form

$$y = (\lambda x + \bar\lambda)e^{p_1x}, \quad z = (\mu x + \bar\mu)e^{p_1x}, \quad u = (\nu x + \bar\nu)e^{p_1x} \tag{146}$$

in place of (142). Substituting (146) into (141), cancelling out $e^{p_1x}$
and equating the coefficients of like powers of $x$ we obtain equations
from which we find two different and independent sets of
possible values of the coefficients $\lambda, \bar\lambda, \dots, \bar\nu$. This results in two
independent solutions of system (141). If a root of the characteristic
equation is of higher multiplicity the corresponding particular
solution of system (141) becomes more complicated.

Matrices are widely used in the theory of linear differential equa- 
tions to simplify the form of writing such systems (see Secs. XI.1-2). 
To do this we usually rewrite system (138) in the form 


$$\begin{aligned}
y_1' &= a_{11}(x)\,y_1 + a_{12}(x)\,y_2 + a_{13}(x)\,y_3 \\
y_2' &= a_{21}(x)\,y_1 + a_{22}(x)\,y_2 + a_{23}(x)\,y_3 \\
y_3' &= a_{31}(x)\,y_1 + a_{32}(x)\,y_2 + a_{33}(x)\,y_3
\end{aligned} \tag{147}$$

and introduce the coefficient matrix

$$A(x) = \begin{pmatrix} a_{11}(x) & a_{12}(x) & a_{13}(x) \\ a_{21}(x) & a_{22}(x) & a_{23}(x) \\ a_{31}(x) & a_{32}(x) & a_{33}(x) \end{pmatrix}$$

and the vector solution

$$\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}$$

which is a one-column matrix. For our further aims it should be
noted that if we are given a matrix $B(x) = (b_{ij}(x))$ the rules of
performing linear operations on matrices (see the beginning of
Sec. XI.2) imply that $\frac{\Delta B}{\Delta x} = \left(\frac{\Delta b_{ij}}{\Delta x}\right)$ and hence $B'(x) = (b_{ij}'(x))$.
This means that to differentiate a matrix we must differentiate all
its elements. It is also easy to verify that all the basic rules of
differentiation (such as the formulas for the derivative of a sum or of
a product) remain true for matrices. It follows that

$$\mathbf{y}' = \begin{pmatrix} y_1' \\ y_2' \\ y_3' \end{pmatrix}$$

and therefore, in the manner of (XI.9), system (147) can be written
in the matrix form

$$\mathbf{y}' = A(x)\,\mathbf{y} \tag{148}$$


Accordingly, non-homogeneous linear system (140) can be written
in an analogous form

$$\mathbf{y}' = A(x)\,\mathbf{y} + \mathbf{f}(x), \qquad \mathbf{f}(x) = \begin{pmatrix} f_1(x) \\ f_2(x) \\ f_3(x) \end{pmatrix}$$

Solution (139) can be rewritten in the vector form

$$\mathbf{y} = C_1\mathbf{y}^{(1)} + C_2\mathbf{y}^{(2)} + C_3\mathbf{y}^{(3)}$$

where the upper indices designate the numbers of the corresponding
linearly independent particular vector solutions of equation (148).
System (141) turns into

$$\mathbf{y}' = A\mathbf{y} \tag{149}$$

and its solution (142) is of the form

$$\mathbf{y} = e^{px}\boldsymbol{\alpha} \tag{150}$$

where

$$\boldsymbol{\alpha} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix}$$

is a constant vector. Substituting (150) into (149) we get

$$pe^{px}\boldsymbol{\alpha} = e^{px}A\boldsymbol{\alpha}, \quad\text{that is}\quad A\boldsymbol{\alpha} = p\boldsymbol{\alpha}$$

We see (compare with Sec. XI.4) that $\boldsymbol{\alpha}$ and $p$ must be, respectively,
an eigenvector and the corresponding eigenvalue of the matrix $A$.
As was shown in Sec. XI.4, an eigenvalue is found from the equation

$$\det(A - pI) = 0$$

which is nothing but equation (144).
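
The procedure is easy to trace on a smaller system. The sketch below is an illustration only: it treats two equations $y' = a_1y + b_1z$, $z' = a_2y + b_2z$ instead of three, assumes $b_1 \ne 0$ and simple roots, and uses hypothetical names. The characteristic equation is then the quadratic $p^2 - (a_1 + b_2)p + (a_1b_2 - b_1a_2) = 0$, and for each root the pair $\lambda = b_1$, $\mu = p - a_1$ satisfies the homogeneous algebraic system.

```python
import cmath

def eigen_solutions_2x2(a1, b1, a2, b2):
    """Roots p and vectors (lam, mu) such that y = lam*e^{px}, z = mu*e^{px}
    solve y' = a1*y + b1*z, z' = a2*y + b2*z (assumes b1 != 0, simple roots)."""
    trace, det = a1 + b2, a1 * b2 - b1 * a2
    disc = cmath.sqrt(trace * trace - 4 * det)
    pairs = []
    for p in ((trace + disc) / 2, (trace - disc) / 2):
        lam, mu = b1, p - a1          # solves (a1 - p)*lam + b1*mu = 0
        pairs.append((p, (lam, mu)))
    return pairs

# Example system: y' = y + 2z, z' = 3y + 2z
pairs = eigen_solutions_2x2(1, 2, 3, 2)
for p, (lam, mu) in pairs:
    # Both residuals of the algebraic system vanish for a root p.
    print(p, lam, mu, (1 - p) * lam + 2 * mu, 3 * lam + (2 - p) * mu)
```

The general solution is then the combination $C_1$ times the first particular solution plus $C_2$ times the second, exactly as in the three-dimensional case.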

In solving systems of form (141) with constant coefficients (and 
also the corresponding non-homogeneous systems) we can apply 
the operator method (see Sec. 20). To do this we write $y' = Dy$,
$z' = Dz$ and $u' = Du$ and then solve the resulting system of equations
as an algebraic system in the unknowns $y$, $z$ and $u$ according to the
methods of Sec. VI.4. But when doing this we stop performing 
algebraic operations when we arrive at formulas of type (VI.13) 
which are of the form P (D) y = f (x) in our case. Then we use 
the methods of Sec. 20 and thus complete the process of solving 
the system. The procedure described here is nothing but another 
form of the method of reducing a first-order system to one equation 
of higher order. 

22. Applications to Testing Lyapunov Stability of Equilibrium 
State. We understand the stability of an object, a state or a process 
as its ability to oppose external actions. It is often impossible to 
take some of these external factors into account beforehand. The 
concept of stability emerged as early as antiquity and now it plays 
an important role in physics and engineering. There are different 
concrete realizations of this general notion depending on the type 
of the object under consideration, on the character of external effects 
and so on. One of the realizations was discussed in our course in 
Sec. 16. Here we are going to consider the notion of the Lyapunov 
stability which is one of the most important forms of stability. It was 
introduced by the prominent Russian mathematician A. M. Lyapunov
(1857-1918) in 1892.

Let the state of an object be described by means of a finite number 
of parameters. For definiteness, suppose that there are three such 
parameters $x$, $y$, $z$. Then a process in which the object changes as
time passes is determined by three functions $x = x(t)$, $y = y(t)$,
$z = z(t)$ (where $t$ is time). Let the law of the process be described
by a system of differential equations of the form

$$\frac{dx}{dt} = P(x, y, z), \qquad \frac{dy}{dt} = Q(x, y, z), \qquad \frac{dz}{dt} = R(x, y, z) \tag{151}$$

where the right-hand sides, which are regarded as being known, do
not contain the independent variable $t$ explicitly. The last condition
means that the character of the differential law governing the
development of the process does not change as time passes.

Suppose that an equilibrium state of the object, that is a state
in which it does not change in time, is described by certain constant
values $x = x_0$, $y = y_0$, $z = z_0$. Then these constants regarded
as functions of time must satisfy system (151). Direct substitution
of $x_0$, $y_0$, $z_0$ into (151) shows that for this to be so it is
necessary and sufficient that the relations

$$P(x_0, y_0, z_0) = 0, \quad Q(x_0, y_0, z_0) = 0, \quad R(x_0, y_0, z_0) = 0 \tag{152}$$


should hold simultaneously. 

Suppose that at an instant $t_0$ the object is shifted from the equilibrium
position, that is the parameters become equal to $x = x_0 + \Delta x_0$,
$y = y_0 + \Delta y_0$ and $z = z_0 + \Delta z_0$. To investigate the further
changes of the state of the object we must solve system (151) with
the initial conditions

$$x(t_0) = x_0 + \Delta x_0, \quad y(t_0) = y_0 + \Delta y_0, \quad z(t_0) = z_0 + \Delta z_0 \tag{153}$$


The equilibrium state in question is said to be stable (Lyapunov 
stable) if after any infinitesimal displacement from the state the 
object remains all the time in an infinitesimal vicinity of the equi- 
librium state. In other words, the differences 


$$\Delta x = x(t) - x_0, \quad \Delta y = y(t) - y_0, \quad \Delta z = z(t) - z_0$$

corresponding to the solutions of system (151) with initial conditions
(153), for infinitesimal $\Delta x_0$, $\Delta y_0$, $\Delta z_0$, must be infinitesimal
over the whole time interval $t_0 < t < \infty$.

At first sight one may find it strange that we consider infinitesimal 
deviations of the parameters and an infinite time interval because, 
practically, all the quantities are finite in reality. But here it is 
advisable to recall the difference between the mathematical and 
practical infinities (see Sec. III.4). A practical infinitesimal quantity
is a real quantity which is small relative to the scale of the
process in question. Similarly, a practically infinite time interval 
is the interval of a transient process, that is a process of passing 
from the state under consideration to a state of a different type 
(for instance, the transition from a state of equilibrium to another 
state of this type or from a state of equilibrium to the collapse of
the object and so on). Hence, in reality the Lyapunov stability
means that any small deviation from the equilibrium state does
not practically change the state.




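To test whether there is stability we substitute $x = x_0 + \Delta x$,
$y = y_0 + \Delta y$ and $z = z_0 + \Delta z$ into system (151) which results in

$$\begin{aligned}
\frac{d(\Delta x)}{dt} &= P(x_0 + \Delta x,\, y_0 + \Delta y,\, z_0 + \Delta z) = (P_x')_0\Delta x + (P_y')_0\Delta y + (P_z')_0\Delta z + \dots \\
\frac{d(\Delta y)}{dt} &= Q(x_0 + \Delta x,\, y_0 + \Delta y,\, z_0 + \Delta z) = (Q_x')_0\Delta x + (Q_y')_0\Delta y + (Q_z')_0\Delta z + \dots \\
\frac{d(\Delta z)}{dt} &= R(x_0 + \Delta x,\, y_0 + \Delta y,\, z_0 + \Delta z) = (R_x')_0\Delta x + (R_y')_0\Delta y + (R_z')_0\Delta z + \dots
\end{aligned} \tag{154}$$

where the notation $(P_x')_0 = P_x'(x_0, y_0, z_0)$ etc. is introduced. In
transforming the right-hand sides we have used Taylor's formula
(XII.17) and formulas (152). Here the dots designate the terms
of an order of smallness higher than the first.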

When investigating the stability we consider only small values 
of Az, Ay, Az and therefore the most important role in the right- 
hand sides of system (154) belongs to the linear terms that are put 
down there. Therefore we replace system (154) by a truncated system 
(a system of the first approximation) which is obtained by dropping 
the terms of higher order of smallness and which has the form 


$$\begin{aligned}
\frac{d(\Delta x)}{dt} &= (P_x')_0\Delta x + (P_y')_0\Delta y + (P_z')_0\Delta z \\
\frac{d(\Delta y)}{dt} &= (Q_x')_0\Delta x + (Q_y')_0\Delta y + (Q_z')_0\Delta z \\
\frac{d(\Delta z)}{dt} &= (R_x')_0\Delta x + (R_y')_0\Delta y + (R_z')_0\Delta z
\end{aligned} \tag{155}$$

System (155) is a linear system with constant coefficients which
can be solved by the method of Sec. 21. According to formula (145)
(in which, of course, we had different notation) the general solution
of system (155) is a linear combination of functions of the form
$e^{pt}$ where $p$ satisfies the characteristic equation

$$\begin{vmatrix} (P_x')_0 - p & (P_y')_0 & (P_z')_0 \\ (Q_x')_0 & (Q_y')_0 - p & (Q_z')_0 \\ (R_x')_0 & (R_y')_0 & (R_z')_0 - p \end{vmatrix} = 0 \tag{156}$$

To small $\Delta x_0$, $\Delta y_0$, $\Delta z_0$ there correspond small values of the
arbitrary constants $C_1$, $C_2$, $C_3$. Therefore the behaviour of a solution,
as $t \to \infty$, is completely determined by the behaviour of the functions
$e^{pt}$. If $p = r + is$ (the case $s = 0$ is not excluded here) we have
$|e^{pt}| = e^{rt}$ [see formula (VIII.13)] and hence

$$|e^{pt}| \xrightarrow[t \to \infty]{} 0 \ \text{ for } r < 0 \quad\text{and}\quad |e^{pt}| \xrightarrow[t \to \infty]{} \infty \ \text{ for } r > 0 \tag{157}$$

Consequently, we arrive at the following conclusions: if all the
roots of characteristic equation (156) have negative real parts (in
particular, they may be negative real numbers) the state of equilibrium
$x_0$, $y_0$, $z_0$ is stable. Besides, in this case we have $x(t) \to x_0$,
$y(t) \to y_0$ and $z(t) \to z_0$ for small $\Delta x_0$, $\Delta y_0$, $\Delta z_0$, as $t \to \infty$. In such
a case we say that the equilibrium state is asymptotically stable.
But if there is at least one root with a positive real part the state
of equilibrium is unstable.

We have derived these results from system (155) but according 
to the above considerations the same assertions are true for the 
complete system, i.e. system (154). It should be noted that our 
conclusions also remain true for the case of multiple roots of equation 
(156) although then we can have powers of $t$ as factors entering
into the solution. In fact, exponential function (157) approaching 
zero, for r < 0, faster than any negative power of t, the above asser- 
tion appears evident. 

Both conclusions obtained above do not include the case when
there are no roots with positive real parts among the roots of equation
(156) but there is at least one root having a zero real part. In
this case there appear functions of the form

$$e^{ist} = \cos st + i\sin st \qquad (|e^{ist}| = 1)$$


in the general solution of system (155). This implies that the object 
is likely to oscillate about the equilibrium position or to remain 
motionless near this position without approaching it. But the time 
interval being infinite, the terms of higher order of smallness that 
have been dropped begin to influence the process noticeably, and 
this can violate the stability. It is therefore impossible to arrive 
at any conclusions on the stability or instability of the state of 
equilibrium in this special case judging by the roots of equation (156). 
To investigate the stability in such a case we must involve some 
additional considerations; for instance, we can try to consider sub- 
sequent terms in expansions (154). 
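
The test by the first approximation reduces to inspecting the real parts of the characteristic roots, which is easy to sketch in code. The example below is an illustration under stated assumptions: it uses two parameters instead of three, so that the roots follow from a quadratic equation, and it takes as a hypothetical object a pendulum with friction, $x' = y$, $y' = -\sin x - ky$, whose matrix of first-approximation coefficients is written out by hand. The function names are not taken from the text.

```python
import cmath

def classify(roots, tol=1e-12):
    """Verdict of the first-approximation test from the characteristic roots."""
    real_parts = [complex(p).real for p in roots]
    if any(r > tol for r in real_parts):
        return "unstable"
    if all(r < -tol for r in real_parts):
        return "asymptotically stable"
    return "undecided by the first approximation"

def roots_2x2(a, b, c, d):
    """Roots of the characteristic equation of the matrix [[a, b], [c, d]]."""
    trace, det = a + d, a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return [(trace + disc) / 2, (trace - disc) / 2]

# Pendulum with friction k: x' = y, y' = -sin x - k*y.
# First-approximation matrix at (x0, 0): [[0, 1], [-cos x0, -k]].
k = 0.5
lower = classify(roots_2x2(0, 1, -1, -k))       # equilibrium x0 = 0
upper = classify(roots_2x2(0, 1, 1, -k))        # equilibrium x0 = pi
no_friction = classify(roots_2x2(0, 1, -1, 0))  # k = 0: purely imaginary roots

print(lower, upper, no_friction, sep="\n")
```

The third case is exactly the special one in which the roots alone give no answer and the dropped terms must be examined.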

In the case when the changes in our object are described by means
of one function $x(t)$ satisfying the differential equation

$$\frac{dx}{dt} = f(x) \tag{158}$$

the above results are especially simple. We see that if $f(x_0) = 0$
and $f'(x_0) < 0$ the value $x = x_0$ corresponds to a stable equilibrium
state, and if $f(x_0) = 0$ and $f'(x_0) > 0$ the state is unstable. [We
suggest that the reader should draw this conclusion on the basis
of the disposition of the isoclines of equation (158) in the $t$, $x$-plane.]
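
The one-dimensional criterion can be tried out numerically. In the sketch below the right-hand side $f(x) = x(1 - x)$, with equilibrium states $x_0 = 0$ and $x_0 = 1$, is an assumed example; the derivative is estimated by a central difference and the motion is followed by a crude Euler scheme, both chosen only for brevity.

```python
def slope(f, x0, h=1e-6):
    """Central-difference estimate of f'(x0)."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

def scalar_verdict(f, x0, tol=1e-8):
    """Sign test for an equilibrium x0 of dx/dt = f(x), where f(x0) = 0."""
    s = slope(f, x0)
    if s < -tol:
        return "stable"
    if s > tol:
        return "unstable"
    return "undecided"

def euler(f, x_start, t_end=20.0, steps=20000):
    """Crude Euler integration of dx/dt = f(x); returns x(t_end)."""
    dt = t_end / steps
    x = x_start
    for _ in range(steps):
        x += dt * f(x)
    return x

def f(x):
    return x * (1.0 - x)

print(scalar_verdict(f, 1.0), euler(f, 1.05))   # a small shift from 1 dies out
print(scalar_verdict(f, 0.0), euler(f, 0.05))   # a small shift from 0 grows
```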

There are many books on the theory of stability. We refer the 
reader to a comprehensive course [31]. 


§ 7. Approximate and Numerical Methods of 
Solving Differential Equations 


We often cannot integrate a differential equation exactly by re- 
ducing it to quadratures. Then we should apply other methods for 
constructing a solution. We have already described the simplest 
graphical method for solving first-order equations (see Sec. 3). 
Here we are going to present some methods for constructing appro- 
ximate formulas of solutions which are analogous to the methods 
of solving finite equations described in § V.1. We shall also discuss 
some numerical methods of solving differential equations which 
yield a sought-for particular solution represented in a tabular form. 
For the sake of simplicity, we shall consider first-order equations 
but the same methods can naturally be extended to equations of 
an arbitrary order and to systems of equations. 

23. Iterative Method. Let us take a first-order differential equation
with a given initial condition of the form

$$y' = f(x, y), \qquad y(x_0) = y_0 \tag{159}$$

Taking integrals of both sides of the equation we obtain

$$\int_{x_0}^{x} y'\,dx = y - y_0 = \int_{x_0}^{x} f(x, y)\,dx = \int_{x_0}^{x} f(x, y(x))\,dx$$

Changing the notation for the variable of integration we write

$$y(x) = y_0 + \int_{x_0}^{x} f(s, y(s))\,ds \tag{160}$$

Equation (160) is equivalent to both equalities (159) because if
we differentiate it we obtain the first equality and if we substitute
$x = x_0$ in it we obtain the second one. Equation (160) is an integral
equation since the unknown function appears under the sign of 
integration in the equation. 

The form of equation (160) is convenient for applying the iterative 
method to it [compare with equation (V.9)] although here we have 
an unknown quantity that is a function and not a number. Now
we must choose a certain function $y_0(x)$ as a zero approximation
(initial approximation). It is desirable to choose it so that it should
be as close as possible to the sought-for solution. If we have no
information about the solution we can simply put $y_0(x) \equiv y_0$. The
first approximation is then found by the formula

$$y_1(x) = y_0 + \int_{x_0}^{x} f(s, y_0(s))\,ds$$

Substituting the result into the right-hand side of (160) we obtain
the second approximation and so on. Generally,

$$y_{n+1}(x) = y_0 + \int_{x_0}^{x} f(s, y_n(s))\,ds \qquad (n = 0, 1, 2, \dots) \tag{161}$$

As in Sec. V.3, if the iterative process converges, that is if the successive
approximations tend to a certain limiting function as $n$ increases,
this function satisfies equation (160). To verify this we can pass
to the limit in equality (161) as $n \to \infty$.

It is known that the iterative method applied to equation (160) usually
converges for all $x$ which are close enough to $x_0$. At any rate, this is so
if the conditions of Cauchy’s theorem (see Sec. 3) are fulfilled. The
convergence is due to the fact that in calculating subsequent approximations
we integrate the preceding ones and that successive integrations “smooth” the
function and gradually eliminate various irregularities which have been
brought in by the choice of the zero approximation, by the round-off errors
and the like. In contrast to it, differentiation of a function, as a rule,
“worsens” the function and increases the initial irregularities. Therefore an
iterative method based on successive differentiations is likely to diverge.

The difference between integration and differentiation is illustrated in
Fig. 298. There is a disturbance of the function depicted in the upper part
of Fig. 298. We see that this essentially changes the derivative of the
function, which is shown below (what is the form of the graph of the second
derivative?). At the same time we see that the integral is changed
very slightly.

Fig. 298

For example, let us consider a particular case of Riccati’s equation
(see the end of Sec. 4) of the form

$$y' = x^2 + y^2$$

with the initial condition $y(0) = 0$. After integrating we have

$$y = \frac{x^3}{3} + \int_0^x y^2(s)\,ds$$

We have no information about the desired solution and therefore
we choose the zero function $y_0(x) \equiv 0$ as a zero approximation
which, at any rate, satisfies the initial condition. Then we obtain
(verify the calculations!)

$$y_1(x) = \frac{x^3}{3}, \qquad y_2(x) = \frac{x^3}{3} + \int_0^x \left(\frac{s^3}{3}\right)^2 ds = \frac{x^3}{3} + \frac{x^7}{63},$$

$$y_3(x) = \frac{x^3}{3} + \frac{x^7}{63} + \frac{2x^{11}}{2079} + \frac{x^{15}}{59535}$$

and so on. We see that for small $|x|$ (for instance, for $|x| < 1$) the
process converges fast. Indeed, we can put $y = \frac{x^3}{3} + \frac{x^7}{63}$ for $|x| < 1$
with an accuracy of 0.001, and for smaller $|x|$ (as long as the omitted
term $\frac{x^7}{63}$ stays below 0.001) the same accuracy is attained if we
simply set $y = \frac{x^3}{3}$.


The question as to what is the approximation at which we should 
stop the iterative process is answered by comparing subsequent 
approximations with the preceding ones. 
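
For a polynomial right-hand side the successive approximations are themselves polynomials, so the iterations can be carried out exactly by machine. The sketch below applies the iterative formula to the problem $y' = x^2 + y^2$, $y(0) = 0$; polynomials are stored as lists of coefficients (index $k$ holds the coefficient of $x^k$), and the helper names are chosen only for this illustration.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given by coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_integrate(a):
    """Integral from 0 to x of the polynomial with coefficients a."""
    return [Fraction(0)] + [ak / (k + 1) for k, ak in enumerate(a)]

def picard_step(y):
    """One iteration for y' = x^2 + y^2, y(0) = 0."""
    rhs = poly_mul(y, y)
    rhs += [Fraction(0)] * (3 - len(rhs))   # make room for the x^2 term
    rhs[2] += 1
    return poly_integrate(rhs)

y = [Fraction(0)]                 # zero approximation y_0(x) = 0
approximations = []
for _ in range(3):
    y = picard_step(y)
    approximations.append(y)

# Nonzero coefficients of the third approximation, as (power, coefficient).
print([(k, c) for k, c in enumerate(approximations[2]) if c != 0])
```

The printed powers 3, 7, 11 and 15 carry the coefficients 1/3, 1/63, 2/2079 and 1/59535 of the third approximation found above.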

24. Application of Taylor’s Series. Differentiating equation (159) 
and using the initial condition we can find the values $y'(x_0)$, $y''(x_0)$
etc. Therefore we can form an expansion of the solution into Taylor’s 
series (see Sec. IV.16). The necessary number of terms that guaran- 
tees a chosen degree of accuracy is found by successive calculation 
of the terms and their comparison with the degree of accuracy. 

For instance, let us take the problem 


$$y' = x^2 + y^2, \qquad y(0) = 1$$

Substituting $x = 0$ into the right-hand side of the equation we find

$$y'(0) = 0^2 + 1^2 = 1$$

Differentiating both sides of the equation we obtain $y'' = 2x + 2yy'$
and then, substituting $x = 0$, we derive

$$y''(0) = 2\cdot 0 + 2\cdot 1\cdot 1 = 2$$

We likewise find

$$y''' = 2 + 2y'^2 + 2yy'', \quad y'''(0) = 8, \quad y^{IV} = 6y'y'' + 2yy''', \quad y^{IV}(0) = 28$$

and so on. Substituting the results into Maclaurin’s formula (IV.54)
we obtain

$$y = y(0) + \frac{y'(0)}{1!}x + \frac{y''(0)}{2!}x^2 + \dots = 1 + x + x^2 + \frac{4}{3}x^3 + \frac{7}{6}x^4 + \dots$$

The formula can be used for small values of $|x|$.
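
The same coefficients can be produced mechanically. Instead of differentiating the equation repeatedly, the sketch below uses an equivalent recurrence (a swapped-in technique, not the procedure of the text): if $y = c_0 + c_1x + c_2x^2 + \dots$ with $c_n = y^{(n)}(0)/n!$, then comparing the coefficients of $x^n$ in $y' = x^2 + y^2$ gives $(n + 1)c_{n+1} = [n = 2] + \sum_{k=0}^{n} c_kc_{n-k}$. The function name is hypothetical.

```python
from fractions import Fraction

def taylor_coefficients(y0, n_terms):
    """Coefficients c_k of y = sum c_k x^k for y' = x^2 + y^2, y(0) = y0."""
    c = [Fraction(y0)]
    for n in range(n_terms - 1):
        square_n = sum(c[k] * c[n - k] for k in range(n + 1))  # x^n term of y^2
        forcing = 1 if n == 2 else 0                           # x^n term of x^2
        c.append((square_n + forcing) / (n + 1))
    return c

coeffs = taylor_coefficients(1, 5)
print(coeffs)   # coefficients of 1, x, x^2, x^3, x^4
```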

25. Application of Power Series with Undetermined Coefficients. 
This method is closely related to Sec. 24. According to the method 
we seek a solution of a given equation in the form of a series with 
unknown coefficients of the form 


$$y = a + b(x - x_0) + c(x - x_0)^2 + d(x - x_0)^3 + \dots \tag{162}$$


The coefficients are found after we substitute the series into the 
equation and equate the coefficients of the same powers of x (and use 
the initial conditions if there are any). 

For example, let us consider Airy’s equation

$$y'' - xy = 0 \tag{163}$$

We shall look for a solution in the form of an expansion in powers
of $x$:

$$y = a_0 + a_1x + a_2x^2 + a_3x^3 + \dots \tag{164}$$


After (164) is differentiated and substituted into the equation 
we obtain 


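$$(1\cdot 2a_2 + 2\cdot 3a_3x + 3\cdot 4a_4x^2 + \dots) - x(a_0 + a_1x + a_2x^2 + \dots) = 0$$

Equating the coefficients of equal powers of $x$ we derive

$$1\cdot 2a_2 = 0, \quad 2\cdot 3a_3 - a_0 = 0, \quad 3\cdot 4a_4 - a_1 = 0, \quad 4\cdot 5a_5 - a_2 = 0, \quad 5\cdot 6a_6 - a_3 = 0, \ \dots$$

from which we find, in succession,

$$a_2 = 0, \quad a_3 = \frac{a_0}{2\cdot 3}, \quad a_4 = \frac{a_1}{3\cdot 4}, \quad a_5 = 0, \quad a_6 = \frac{a_3}{5\cdot 6} = \frac{a_0}{2\cdot 3\cdot 5\cdot 6},$$

$$a_7 = \frac{a_4}{6\cdot 7} = \frac{a_1}{3\cdot 4\cdot 6\cdot 7}, \quad a_8 = 0, \quad a_9 = \frac{a_6}{8\cdot 9} = \frac{a_0}{2\cdot 3\cdot 5\cdot 6\cdot 8\cdot 9}$$

and so on.

The substitution of these results into formula (164) yields the
general solution of equation (163):

$$\begin{aligned}
y &= a_0 + a_1x + \frac{a_0}{2\cdot 3}x^3 + \frac{a_1}{3\cdot 4}x^4 + \frac{a_0}{2\cdot 3\cdot 5\cdot 6}x^6 + \frac{a_1}{3\cdot 4\cdot 6\cdot 7}x^7 + \dots \\
&= a_0\left(1 + \frac{x^3}{2\cdot 3} + \frac{x^6}{2\cdot 3\cdot 5\cdot 6} + \frac{x^9}{2\cdot 3\cdot 5\cdot 6\cdot 8\cdot 9} + \dots\right) \\
&\quad + a_1\left(x + \frac{x^4}{3\cdot 4} + \frac{x^7}{3\cdot 4\cdot 6\cdot 7} + \frac{x^{10}}{3\cdot 4\cdot 6\cdot 7\cdot 9\cdot 10} + \dots\right)
\end{aligned}$$

The constants $a_0$ and $a_1$ are left undetermined and play the role
of the arbitrary constants which were previously designated by $C_1$ and
$C_2$ in Sec. 14 (see property 5 in the section). The series which are
taken inside the parentheses are two linearly independent particular
solutions of equation (163). In particular, putting
$a_0 = \left[\sqrt[3]{9}\,\Gamma\left(\tfrac{2}{3}\right)\right]^{-1}$ and
$a_1 = -\left[\sqrt[3]{3}\,\Gamma\left(\tfrac{1}{3}\right)\right]^{-1}$
we obtain the so-called Airy’s function
of the first kind which is denoted as $\mathrm{Ai}(x)$.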
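
The chain of equalities for the coefficients is a recurrence that is convenient to run by machine: comparing the coefficients of $x^{n-2}$ in Airy's equation gives $(n - 1)\,n\,a_n - a_{n-3} = 0$ for $n \ge 3$, together with $a_2 = 0$. The sketch below (function name hypothetical) computes the coefficients of the two basic series exactly, taking $a_0 = 1$, $a_1 = 0$ for the first and $a_0 = 0$, $a_1 = 1$ for the second.

```python
from fractions import Fraction

def airy_series_coefficients(a0, a1, n_terms):
    """Coefficients of y = a0 + a1 x + a2 x^2 + ... solving y'' - x y = 0."""
    a = [Fraction(a0), Fraction(a1), Fraction(0)]      # 1*2*a2 = 0
    for n in range(3, n_terms):
        a.append(a[n - 3] / ((n - 1) * n))             # (n-1)*n*a_n - a_{n-3} = 0
    return a[:n_terms]

first = airy_series_coefficients(1, 0, 10)    # series multiplying a0
second = airy_series_coefficients(0, 1, 11)   # series multiplying a1

print(first[3], first[6], first[9])       # 1/(2*3), 1/(2*3*5*6), 1/(2*3*5*6*8*9)
print(second[4], second[7], second[10])   # 1/(3*4), 1/(3*4*6*7), 1/(3*4*6*7*9*10)
```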
The above technique is always applicable to linear equations
of the form

$$a_0(x)\,y^{(n)} + a_1(x)\,y^{(n-1)} + \dots + a_n(x)\,y = f(x) \tag{165}$$

in case all the functions $a_0(x), a_1(x), \dots, f(x)$ are polynomials in
$x$ or, in a more general case, when they are sums of power series
in powers of $x - x_0$ and $a_0(x_0) \ne 0$.

If $a_0(x_0) = 0$ the value $x_0$ is said to be a singular point of equation
(165). Then there may be no solution of form (162). In this
case it is sometimes possible to find a solution in the form

$$y = (x - x_0)^{\rho}\left[a + b(x - x_0) + c(x - x_0)^2 + d(x - x_0)^3 + \dots\right] \tag{166}$$

where the constant $\rho$ should also be chosen. When selecting a solution
of this form we can regard $a$ as being unequal to zero because
otherwise we can take a certain power of $x - x_0$ outside the brackets
and thus the operation reduces to a change of $\rho$ in its value.

26. Bessel’s Functions. We now consider an important example
of the so-called Bessel’s equation of the form

$$x^2y'' + xy' + (x^2 - p^2)\,y = 0 \tag{167}$$

where $p = \text{const} \ge 0$ and $0 < x < \infty$. Solutions of the equation
are called Bessel’s functions although in fact they were used by
Euler beginning with 1766, that is before Bessel was born. The
functions are also called cylindrical functions because
they are widely applied to solving equations of mathematical physics
in a domain of the form of a circular cylinder.

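Since the point $x = 0$ is a singular point of equation (167) its
solution should be sought for according to formula (166) in which
we must set $x_0 = 0$:

$$y = ax^{\rho} + bx^{\rho+1} + cx^{\rho+2} + dx^{\rho+3} + ex^{\rho+4} + fx^{\rho+5} + gx^{\rho+6} + hx^{\rho+7} + ix^{\rho+8} + \dots \tag{168}$$

The differentiation of (168) and the substitution into (167) result in

$$\begin{aligned}
&x^2\left[a\rho(\rho - 1)x^{\rho-2} + b(\rho + 1)\rho x^{\rho-1} + c(\rho + 2)(\rho + 1)x^{\rho} + \dots\right] \\
&\quad + x\left[a\rho x^{\rho-1} + b(\rho + 1)x^{\rho} + c(\rho + 2)x^{\rho+1} + \dots\right] \\
&\quad + x^2\left(ax^{\rho} + bx^{\rho+1} + cx^{\rho+2} + \dots\right) - p^2\left(ax^{\rho} + bx^{\rho+1} + cx^{\rho+2} + \dots\right) = 0
\end{aligned}$$

After the coefficients of the same powers of $x$ are equated we
obtain an infinite number of equalities which are then solved in
succession:

$$\begin{aligned}
a\rho(\rho - 1) + a\rho - ap^2 = 0 \quad &\text{yields} \quad a(\rho^2 - p^2) = 0, \\
b(\rho + 1)\rho + b(\rho + 1) - bp^2 = 0 \quad &\text{yields} \quad b(\rho^2 + 2\rho + 1 - p^2) = 0, \\
c(\rho + 2)(\rho + 1) + c(\rho + 2) + a - cp^2 = 0 \quad &\text{yields} \quad c(\rho^2 + 4\rho + 4 - p^2) + a = 0, \\
d(\rho + 3)(\rho + 2) + d(\rho + 3) + b - dp^2 = 0 \quad &\text{yields} \quad d(\rho^2 + 6\rho + 9 - p^2) + b = 0, \\
e(\rho + 4)(\rho + 3) + e(\rho + 4) + c - ep^2 = 0 \quad &\text{yields} \quad e(\rho^2 + 8\rho + 16 - p^2) + c = 0
\end{aligned}$$

and so on. Since $a \ne 0$ we see that, by the first equality, we
have $\rho^2 = p^2$, i.e. $\rho = \pm p$. Substituting the result into the remaining
equalities we obtain, in succession,

$$b = 0, \quad c = -\frac{a}{2^2(\rho + 1)}, \quad d = 0, \quad e = \frac{a}{2^4\cdot 1\cdot 2\,(\rho + 1)(\rho + 2)}, \quad f = 0,$$

$$g = -\frac{a}{2^6\cdot 1\cdot 2\cdot 3\,(\rho + 1)(\rho + 2)(\rho + 3)}, \quad h = 0, \quad i = \frac{a}{2^8\cdot 1\cdot 2\cdot 3\cdot 4\,(\rho + 1)(\rho + 2)(\rho + 3)(\rho + 4)}$$

From this, on the basis of formula (168), we find a solution
of the form

$$y = ax^{\rho} - \frac{a}{2^2(\rho + 1)}x^{\rho+2} + \frac{a}{2^4\cdot 1\cdot 2\,(\rho + 1)(\rho + 2)}x^{\rho+4} - \frac{a}{2^6\cdot 1\cdot 2\cdot 3\,(\rho + 1)(\rho + 2)(\rho + 3)}x^{\rho+6} + \dots \tag{169}$$

where $a$ is an arbitrary constant. Here it is convenient to put
$a = \frac{1}{2^{\rho}\Gamma(\rho + 1)}$ (see Sec. XIV.17). By formula (XIV.66), we have

$$\Gamma(\rho + 1)(\rho + 1) = \Gamma(\rho + 2), \quad \Gamma(\rho + 1)(\rho + 1)(\rho + 2) = \Gamma(\rho + 2)(\rho + 2) = \Gamma(\rho + 3) \ \text{ etc.}$$

and therefore formula (169) implies, for the above $a$, that

$$y = \frac{1}{\Gamma(\rho + 1)}\left(\frac{x}{2}\right)^{\rho} - \frac{1}{1!\,\Gamma(\rho + 2)}\left(\frac{x}{2}\right)^{\rho+2} + \frac{1}{2!\,\Gamma(\rho + 3)}\left(\frac{x}{2}\right)^{\rho+4} - \dots = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!\,\Gamma(\rho + n + 1)}\left(\frac{x}{2}\right)^{\rho+2n} \tag{170}$$

This sum is called Bessel’s function of the first kind of order $\rho$
and is denoted as $J_{\rho}(x)$. Since $\rho = \pm p$ the general solution of equation
(167) (see property 5 in Sec. 14) can be written as

$$y = C_1J_p(x) + C_2J_{-p}(x) \tag{171}$$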

Formula of the general solution {(171) does not apply for integer 

p = 0, 1, 2, 3, .... Actually, for such p and p = —p we have 
T(—p+1)=P(—p+2)=...=P(—p+p)=+™ 


and therefore formula (170) results in 


yy (=i -p+2n 
isl) ETEA (ae: og 


n=p 


pus < (—1)P(—1)" p+2n’ 
= Derren (A) =(P (P=0, 1,2) 
n’=0 


(we have made the substitution $n - p = n'$). Consequently, in
this case the solutions $J_p(x)$ and $J_{-p}(x)$ are linearly dependent
(see Sec. 14), and formula (171) does not therefore express the general
solution (although it yields a family of particular solutions
depending on one essential parameter).

To obtain a formula for the general solution of equation (167)
valid for all $p$ we can apply an operation similar to the one used
in Sec. 17 (see case 3). Namely, we first suppose that $p$ is
non-integral, and form the function

$$
Y_p(x) = \cot p\pi \cdot J_p(x) - \frac{1}{\sin p\pi}\,J_{-p}(x) = \frac{\cos p\pi \cdot J_p(x) - J_{-p}(x)}{\sin p\pi}
$$

The last function being a linear combination of the solutions, we
obtain a solution of equation (167) which is called Bessel’s function
of the second kind of order $p$. It is also sometimes designated as
$N_p(x)$. Now, if $p$ becomes an integer, there appears an indeterminate
form on the right-hand side (why?). It can be evaluated according
to L’Hospital’s rule but we shall not put down the calculations
here. We only note that the result will be a sum in which the
expression

$$
-\frac{(p-1)!\,2^p}{\pi x^p} \quad (\text{for } p = 1, 2, 3, \dots) \qquad \text{or} \qquad \frac{2}{\pi}\ln x \quad (\text{for } p = 0)
$$

will be the principal term as $x \to 0$.

Thus, the formula

$$
y = C_1 J_p(x) + C_2 Y_p(x) \tag{172}
$$

represents the general solution of equation (167) for all $p \geq 0$
(both integral and non-integral) in the interval $0 < x < \infty$. The
value $J_p(+0)$ is finite here whereas $Y_p(+0) = -\infty$. Therefore if
the conditions of a problem imply that $y(+0)$ should be finite we must
retain only the first summand on the right-hand side of formula (172).
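
As a numerical illustration of series (170), the following minimal
Python sketch sums its first terms and checks the linear dependence
of $J_p$ and $J_{-p}$ for integral $p$. The function name `bessel_j`,
the truncation at 40 terms and the convention of treating $1/\Gamma$
as zero at the poles of $\Gamma$ are assumptions made for this example,
not part of the text.

```python
import math


def bessel_j(order, x, terms=40):
    """Partial sum of series (170) for x > 0 (illustrative helper)."""
    total = 0.0
    for n in range(terms):
        try:
            gamma_value = math.gamma(order + n + 1)
        except ValueError:
            # Gamma has a pole here, so 1/Gamma is taken as zero.
            continue
        term = (-1) ** n / (math.factorial(n) * gamma_value)
        total += term * (x / 2) ** (order + 2 * n)
    return total


x = 1.5
# For integral p the two solutions differ only by the factor (-1)^p.
print(bessel_j(-3, x), -bessel_j(3, x))
# For p = 1/2 the series sums to sqrt(2/(pi x)) sin x.
print(bessel_j(0.5, x), math.sqrt(2 / (math.pi * x)) * math.sin(x))
```

The two numbers printed on each line should agree to rounding error;
for non-integral orders such as $p = 1/2$ the functions `bessel_j(p, x)`
and `bessel_j(-p, x)` are independent, which is what formula (171)
requires.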

Bessel’s functions have been thoroughly studied, and there exist
extensive tables for the functions. The functions

$$
\begin{aligned}
J_0(x) &= 1 - \frac{x^2}{2^2} + \frac{x^4}{2^2 \cdot 4^2} - \frac{x^6}{2^2 \cdot 4^2 \cdot 6^2} + \dots, \\
J_1(x) &= \frac{x}{2} - \frac{x^3}{2^2 \cdot 4} + \frac{x^5}{2^2 \cdot 4^2 \cdot 6} - \frac{x^7}{2^2 \cdot 4^2 \cdot 6^2 \cdot 8} + \dots
\end{aligned} \tag{173}
$$

are the most important for applications. The graphs of these functions
are approximately represented in Fig. 299. These functions and
all Bessel’s functions of the first and of the second kind change their
signs infinitely many times and tend to zero as $x$ increases. From
formulas (173) we can easily deduce the relationship $J_0'(x) = -J_1(x)$
(check it up!). There are some other relationships between Bessel’s
functions. All these properties can be found in [29].
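
The relationship between $J_0$ and $J_1$ can be checked by
differentiating the power series of $J_0$ term by term. The following
lines are a sketch of that calculation; the index shift $n = m + 1$ is
notation introduced only for this check.

```latex
J_0'(x) = \frac{d}{dx}\sum_{n=0}^{\infty} \frac{(-1)^n}{(n!)^2}\left(\frac{x}{2}\right)^{2n}
        = \sum_{n=1}^{\infty} \frac{(-1)^n}{n!\,(n-1)!}\left(\frac{x}{2}\right)^{2n-1}
        = -\sum_{m=0}^{\infty} \frac{(-1)^m}{m!\,(m+1)!}\left(\frac{x}{2}\right)^{2m+1}
        = -J_1(x)
```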

27. Small Parameter Method. This method was described in Sec. 
V.5. It can also be applied to solving differential equations. Here 
we present some simple examples. 

The problem

$$
y' = \frac{x}{1 + 0.1xy}, \quad y(0) = 0 \tag{174}
$$

does not involve any parameters. But we can take a more general
problem of the form

$$
y' = \frac{x}{1 + \alpha xy}, \quad y(0) = 0 \tag{175}
$$

where $\alpha$ is a small parameter. Problem (174) is a particular case
of (175) for $\alpha = 0.1$. Problem (175) can be easily solved for
$\alpha = 0$.

Fig. 299

Evidently, in this case we have $y = \dfrac{x^2}{2}$. Therefore we seek
the solution of problem (175) as an expansion in powers of $\alpha$,
that is in the form

$$
y = \frac{x^2}{2} + \alpha u + \alpha^2 v + \alpha^3 w + \dots \tag{176}
$$

where the functions $u = u(x)$, $v = v(x)$ etc. depending on $x$ are
yet unknown.

Substituting (176) into (175) and multiplying by the denominator
we obtain

$$
\left(x + \alpha u' + \alpha^2 v' + \alpha^3 w' + \dots\right)\left(1 + \alpha\frac{x^3}{2} + \alpha^2 xu + \alpha^3 xv + \dots\right) = x \tag{177}
$$

The initial condition implies

$$
\alpha u(0) + \alpha^2 v(0) + \alpha^3 w(0) + \dots = 0
$$

and therefore we have

$$
u(0) = 0, \quad v(0) = 0, \quad w(0) = 0, \ \dots \tag{178}
$$

Removing the parentheses in (177) and equating the coefficients of
the successive powers of $\alpha$ to zero we get, in succession,

$$
u' + \frac{x^4}{2} = 0, \quad v' + \frac{x^3}{2}u' + x^2 u = 0, \quad w' + \frac{x^3}{2}v' + xuu' + x^2 v = 0
$$

and so on. Now taking into account equalities (178) we find

$$
u = -\frac{x^5}{10}, \quad v = \frac{7}{160}x^8, \quad w = -\frac{43}{1760}x^{11}
$$

etc. (check up the calculations!).
Consequently, formula (176) yields

$$
y = \frac{x^2}{2} - \frac{\alpha}{10}x^5 + \frac{7\alpha^2}{160}x^8 - \frac{43\alpha^3}{1760}x^{11} + \dots
$$

In particular, putting $\alpha = 0.1$, we obtain the expression

$$
y = \frac{x^2}{2} - \frac{x^5}{100} + \frac{7x^8}{16{,}000} - \frac{43x^{11}}{1{,}760{,}000} + \dots
$$

for the solution of equation (174). This series converges very well
for $|x| < 1$, the convergence being a little slower for $1 < |x| < 2$.
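
The successive equations for $u$, $v$, $w$ can be generated and
integrated mechanically. The following Python sketch does this in
exact rational arithmetic. It rests on an observation that is ours
(obtained by collecting the terms of order $\alpha^k$ in
$y'(1 + \alpha xy) = x$): writing $y_0 = x^2/2$, $y_1 = u$, $y_2 = v$,
..., the $k$-th correction satisfies
$y_k' = -x\sum_{i+j=k-1} y_i y_j'$ with $y_k(0) = 0$. The polynomial
helpers and all names are hypothetical.

```python
from fractions import Fraction


def multiply(p, q):
    """Product of two polynomials stored as {power: coefficient}."""
    result = {}
    for i, a in p.items():
        for j, b in q.items():
            result[i + j] = result.get(i + j, Fraction(0)) + a * b
    return result


def derivative(p):
    return {i - 1: i * a for i, a in p.items() if i > 0}


def integral_from_zero(p):
    return {i + 1: a / (i + 1) for i, a in p.items()}


def corrections(count):
    """Return [y0, u, v, w, ...] for y' = x / (1 + alpha*x*y), y(0) = 0."""
    x = {1: Fraction(1)}
    ys = [{2: Fraction(1, 2)}]  # y0 = x^2/2 solves the problem for alpha = 0
    for k in range(1, count + 1):
        rhs = {}
        for i in range(k):
            product = multiply(x, multiply(ys[i], derivative(ys[k - 1 - i])))
            for power, coeff in product.items():
                rhs[power] = rhs.get(power, Fraction(0)) - coeff
        ys.append(integral_from_zero(rhs))
    return ys


y0, u, v, w = corrections(3)
print(u, v, w)
```

Each result is a dictionary mapping a power of $x$ to its exact
coefficient, so the printed values can be compared directly with the
coefficients found by hand.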

As another example, let us take the problem

$$
y' = \sin(xy), \quad y(0) = \alpha \tag{179}
$$

Here, in contrast to the previous problem, the parameter enters
into the initial condition. Problem (179) has an obvious
solution for $\alpha = 0$, namely the solution $y = 0$. Therefore we
look for the solution of the form

$$
y = \alpha u + \alpha^2 v + \alpha^3 w + \dots \quad (u = u(x),\ v = v(x),\ \dots) \tag{180}
$$

for small $|\alpha|$. The substitution of the value $x = 0$ yields

$$
u(0) = 1, \quad v(0) = 0, \quad w(0) = 0, \ \dots \tag{181}
$$

On the other hand, substituting (180) into differential equation
(179) and taking into account the power series for the sine [see
formula (IV.57)] we obtain

$$
\alpha u' + \alpha^2 v' + \alpha^3 w' + \dots = \frac{\alpha xu + \alpha^2 xv + \alpha^3 xw + \dots}{1!} - \frac{(\alpha xu + \alpha^2 xv + \alpha^3 xw + \dots)^3}{3!} + \dots
$$

Equating the coefficients of the same powers of $\alpha$ we get

$$
u' = xu, \quad v' = xv, \quad w' = xw - \frac{x^3u^3}{6}
$$

Hence, here we have arrived at linear equations which must be
solved in succession. Integrating the equations with initial
conditions (181) we find

$$
u = e^{\frac{x^2}{2}}, \quad v = 0, \quad w = \frac{1}{12}(1 - x^2)e^{\frac{3}{2}x^2} - \frac{1}{12}e^{\frac{x^2}{2}}
$$

(check up the results!). Substituting the above expressions into
(180) we obtain the sought-for solution in the form of an expansion
which is valid for small $|x|$ and $|\alpha|$.
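
For the reader who checks the results: the equation
$w' = xw - x^3u^3/6$ with $u = e^{x^2/2}$ and $w(0) = 0$ can be
integrated by the substitution $w = e^{x^2/2}g(x)$. The auxiliary
function $g$ is introduced only for this sketch.

```latex
w = e^{x^2/2} g(x), \qquad g' = -\frac{x^3}{6}\,e^{x^2}, \qquad g(0) = 0,
\\
\int x^3 e^{x^2}\,dx = \frac{1}{2}(x^2 - 1)e^{x^2} + C
\quad\Longrightarrow\quad
g = \frac{1}{12}(1 - x^2)e^{x^2} - \frac{1}{12},
\\
w = \frac{1}{12}(1 - x^2)e^{\frac{3}{2}x^2} - \frac{1}{12}e^{\frac{x^2}{2}}
```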

In more complicated problems involving the small parameter 
method it often turns out that even the determination of the first 
term containing the parameter may yield fruitful results. 

28. General Remarks on Dependence of Solutions on Parameters.
Here we give some general considerations related to the problems
discussed in the preceding section. There are many cases when a
differential equation in question or a system of such equations
contains one or more parameters which can take on different
constant values. For simplicity’s sake, let us consider a first-order
equation of the form

$$
\frac{dy}{dx} = f(x, y, \lambda) \tag{182}
$$

where $\lambda$ is a parameter. Suppose we have certain initial
conditions $x = x_0$, $y = y_0$.

Let us suppose that the point $(x_0, y_0)$ is not a singular point (see
Sec. 7). This means that there is a single solution of equation (182)
satisfying the initial conditions. Then the geometric meaning of
equation (182) (see Sec. 3) implies that if its right-hand side is
continuous in $\lambda$ the change of the direction field corresponding
to small variations of $\lambda$ will also be small. Therefore the
solution $y(x, \lambda)$ will also be continuous in $\lambda$. We can
likewise conclude that the situation will be the same if not only the
right-hand side of the equation but the initial conditions as well
depend continuously on $\lambda$, that is if $x_0 = x_0(\lambda)$ and
$y_0 = y_0(\lambda)$.

Suppose that the solution $y(x, \lambda)$ of equation (182) is known for
a certain value $\lambda$ (the “non-perturbed” value of the parameter, as
it is called). Now, let the value of the parameter change and become
equal to $\lambda + \Delta\lambda$ where $|\Delta\lambda|$ is small. Then
$y$ will also change and gain an increment $\Delta_\lambda y$. The
differential of $\Delta_\lambda y$, that is its principal linear part,
will be called a variation of the solution and will be designated as
$\delta y$.

Thus, a variation is a particular differential of a solution
corresponding to an increment of the parameter. The new notation has
been introduced to distinguish between the differentials
corresponding to the argument $x$ and to the parameter $\lambda$. When
it is permissible to neglect infinitesimals of higher order of
smallness we can simply say that the variation of a solution is an
infinitesimal change of the solution due to an infinitesimal change of
the parameter. With a value of $\lambda$ chosen, the quantity $\delta y$
(as well as $y$) depends on $x$. Hence we can write
$\delta y = \delta y(x)$ where $\delta y(x)$ also depends on
$\Delta\lambda$ and is directly proportional to it (although the
dependence on $\Delta\lambda$ is not indicated explicitly here).

To form a differential equation for $\delta y$ we must equate the
differentials (taken with respect to $\lambda$) of both sides of
equality (182):

$$
\delta\frac{dy}{dx} = \delta\left(f(x, y, \lambda)\right) = \frac{\partial f}{\partial y}\,\delta y + \frac{\partial f}{\partial \lambda}\,\delta\lambda \quad (\delta\lambda = \Delta\lambda)
$$

that is

$$
\frac{d(\delta y)}{dx} = f'_y(x, y, \lambda)\,\delta y + f'_\lambda(x, y, \lambda)\,\delta\lambda \tag{183}
$$

In deducing (183) we have interchanged the signs $d$ and $\delta$ because
these are the signs of the differentials taken with respect to
different variables (see Sec. IX.15). We have also applied the formula
for differentiation of a composite function. Equation (183) is called
the variational equation corresponding to original equation (182).
Since the “non-perturbed” solution $y(x, \lambda)$ is substituted for $y$
in the right-hand side of (183) the equation is linear in $\delta y$ and
can be easily integrated (see Sec. 4). Variational equations which
correspond to higher-order equations or to systems of equations are
not integrable by quadratures in the general case but they are
always linear.

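Let us establish the initial condition for $\delta y$. In the general
case, when $x_0 = x_0(\lambda)$ and $y_0 = y_0(\lambda)$, we substitute
$\lambda + \delta\lambda$ for $\lambda$ and neglect infinitesimals of
higher order which implies that we have
$y = y_0(\lambda + \delta\lambda) = y_0 + y_0'\,\delta\lambda$ for
$x = x_0(\lambda + \delta\lambda) = x_0 + x_0'\,\delta\lambda$
where $x_0' = \dfrac{dx_0}{d\lambda}$ and $y_0' = \dfrac{dy_0}{d\lambda}$.
Hence, for the value $\lambda + \delta\lambda$ of the parameter and for
$x = x_0$, we have

$$
y\big|_{x=x_0} = y\big|_{x=x_0+x_0'\delta\lambda} - d_x y\big|_{dx=x_0'\delta\lambda} = y_0 + y_0'\,\delta\lambda - \frac{dy}{dx}\,x_0'\,\delta\lambda = y_0 + y_0'\,\delta\lambda - f_0 x_0'\,\delta\lambda
$$

where $f_0 = f(x_0, y_0, \lambda)$. But the same value of $y$ is equal to
$(y(x, \lambda) + \delta y)\big|_{x=x_0} = y_0 + (\delta y)\big|_{x=x_0}$.
Therefore, the initial value of $\delta y$ is expressed by the following
initial condition:

$$
(\delta y)\big|_{x=x_0} = (y_0' - f_0 x_0')\,\delta\lambda
$$

In the particular case of values $x_0$ and $y_0$ independent of
$\lambda$ we have $x_0' = y_0' = 0$ and therefore the initial condition
has the form $(\delta y)\big|_{x=x_0} = 0$.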
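
As an illustration (a worked example of ours, not taken from the
text), consider the problem $y' = x/(1 + \lambda xy)$, $y(0) = 0$ of
Sec. 27, writing $\lambda$ for the small parameter, with the
non-perturbed value $\lambda = 0$ and the non-perturbed solution
$y = x^2/2$. Applying the variational equation and the zero initial
condition gives:

```latex
f(x, y, \lambda) = \frac{x}{1 + \lambda xy}, \qquad
f'_y = -\frac{\lambda x^2}{(1 + \lambda xy)^2}, \qquad
f'_\lambda = -\frac{x^2 y}{(1 + \lambda xy)^2},
\\
\text{for } \lambda = 0,\ y = \frac{x^2}{2}: \qquad
\frac{d(\delta y)}{dx} = -\frac{x^4}{2}\,\delta\lambda, \qquad (\delta y)\big|_{x=0} = 0,
\\
\delta y = -\frac{x^5}{10}\,\delta\lambda
```

The variation thus coincides with the first-order term of the
expansion in powers of the parameter obtained by the small parameter
method.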


There are some cases when a parameter enters into a differential
equation in such a way that the order of the equation reduces, that
is the equation degenerates, for certain values of the parameter.
Such a situation is connected with new phenomena which we shall
illustrate by taking an example.

Consider the problem

$$
\lambda y' + y = 0, \quad y\big|_{x=0} = 1 \tag{184}
$$

whose solution is $y = e^{-\frac{x}{\lambda}}$. The value $\lambda = 0$
yields the degeneration (why?). Let the solution be considered for
$x \geq 0$ and let $\lambda \to +0$; the solution is depicted in
Fig. 300. The limiting case for equation (184), as $\lambda \to +0$, is
the equation $y = 0$. This is a finite equation whose solution $y = 0$
does not satisfy the initial condition $y|_{x=0} = 1$. Besides, we see
that when $\lambda$ is close to zero (but unequal to zero) the solution
is close to zero for the values of $x$ which are not too close to
$x = 0$ (for instance, for the values of $x$ exceeding the value $h$
shown in Fig. 300), this not being the case for the values of $x$ which
are close to $x = 0$ (for instance, for the values of $x$ lying in the
interval $0 \leq x \leq h$). An interval of the type $0 \leq x \leq h$
(which we have in our particular problem) is called a boundary layer.
The solution of (184) must cross the boundary layer $0 \leq x \leq h$
in order to pass from the unit initial value (184) to a value which is
close to zero.

Fig. 300

The width of a boundary layer is understood conditionally. In
fact, in our example the solution never turns into zero. If we
conditionally take a certain value of $x$ as the width of the boundary
layer, for instance, $x = h$ for which the magnitude of the solution
decreases 10 times in comparison with the initial magnitude, then,
in the case of problem (184), we obtain

$$
e^{-\frac{h}{\lambda}} = 0.1 \quad \text{and} \quad h = \ln 10 \cdot \lambda
$$

that is the width of the boundary layer is directly proportional
to $\lambda$.

If $\lambda \to -0$ we obtain the solution which is shown by the dotted
line in Fig. 300. This solution tends to infinity for any $x > 0$ as
$\lambda \to -0$. This case is not as interesting as the preceding one.

An analogous phenomenon is often encountered in more complicated
cases. For instance, let the solution of a second-order equation
satisfying two initial conditions or boundary conditions (see Sec. 16)
be considered and let the order of the equation be reduced by unity
for a certain value $\lambda = \lambda_0$. Then the solution $y_0(x)$
corresponding to $\lambda = \lambda_0$ satisfies a first-order equation.
Therefore it often happens that if the solution is finite for
$\lambda = \lambda_0$ it satisfies only one of the initial conditions and
may not satisfy the other. The solution $y(x)$ has a boundary layer for
the values of $\lambda$ close to $\lambda_0$ (with the width
proportional to $|\lambda - \lambda_0|$) which is traversed by the
solution when it passes from the other condition to $y_0(x)$. A similar
situation may also occur for systems of differential equations.

29. Methods of Minimizing Discrepancy. The methods are based
on the idea that an unknown function is sought in a form containing
several parameters, that is in the form

$$
y = \varphi(x; \lambda_1, \lambda_2, \dots, \lambda_m) \tag{185}
$$

The right-hand side is usually chosen in such a way that the initial
or boundary conditions imposed on a solution should be satisfied
for any values of the parameters. After expression (185) has been
substituted into a given differential equation we obtain the
corresponding discrepancy, that is the difference between the
right-hand and left-hand sides. The discrepancy (which we denote by
$h$) depends upon the parameters:

$$
h = h(x; \lambda_1, \lambda_2, \dots, \lambda_m)
$$

If solution (185) were exact the discrepancy $h$ would equal zero
identically. Therefore, in order to determine the necessary values of
the parameters $\lambda_1, \lambda_2, \dots, \lambda_m$, we can impose
$m$ additional conditions that should be satisfied by the discrepancy
and that are automatically fulfilled for the function which is
identically equal to zero. For instance, we can equate $h$ to zero for
$m$ different values of $x$; this is the collocation method. We can try
to minimize the integral $\int_a^b h^2\,dx$ on the interval
$a \leq x \leq b$ in which the solution is being constructed; this is
the method of least squares. We can equate to zero $m$ integrals of the
form $\int_a^b h\psi_1(x)\,dx$, $\int_a^b h\psi_2(x)\,dx$, ...,
$\int_a^b h\psi_m(x)\,dx$ where $\psi_1(x), \psi_2(x), \dots, \psi_m(x)$
is a chosen system of functions; this is the method of moments
(integrals of this type are called moments).

The greater the number of the parameters we introduce, the more 
“flexible” formula (185) (that is the greater the accuracy of the repre- 
sentation of a sought-for solution which can be attained by means 
of the formula). But at the same time every increase in the number
of parameters leads to more and more complicated calculations.
It is an art to be able to forecast the form of the sought-for
solution by using a formula containing a few parameters. The
correctness of the result can be judged by comparing the results of
repeated calculations performed according to different methods or with
the help of different numbers of parameters etc.

We can also take a right-hand side of formula (185) which iden- 
tically satisfies only a number of initial or boundary conditions 
imposed on the solution. Then the corresponding number of con- 
ditions chosen for determining the values of the parameters must 
imply that the remaining conditions should be satisfied. It is appa- 
rent that in such a case the number of additional conditions that 
can be chosen more or less arbitrarily is respectively reduced. 

Let us take an example in which it is possible to compare
approximate solutions with the exact solution. Let it be necessary to
solve the problem

$$
y'' + y = 0 \quad (0 \leq x \leq 1), \quad y(0) = 0, \quad y(1) = 1
$$

Let us look for the solution of the form

$$
y = \lambda x + \mu x^2 \tag{186}
$$

Here the first boundary condition is satisfied automatically whereas
the second implies $\lambda + \mu = 1$. From this we obtain
$y = \lambda x + (1 - \lambda)x^2$ and hence we have only one degree of
freedom at our disposal, that is only one additional condition that we
can choose to minimize the discrepancy which has the form

$$
h = y'' + y = 2(1 - \lambda) + \lambda x + (1 - \lambda)x^2
$$

in our case. The collocation method applied to $x = \frac{1}{2}$ yields
the value $\lambda = \frac{9}{7}$. The method of least squares used for
the interval $0 \leq x \leq 1$ yields the value
$\lambda = \frac{257}{202}$. Finally, the method of moments with the
function $\psi(x) = 1$ yields the value $\lambda = \frac{14}{11}$ (let
the reader verify all the calculations!). Substituting these values of
$\lambda$ (and $\mu = 1 - \lambda$) into (186) we obtain the
corresponding expressions which approximate the exact solution
$y = \dfrac{\sin x}{\sin 1}$ fairly well. For example, the exact
solution is equal to 0.5697 for $x = 0.5$ whereas the approximations
yield, respectively, the values 0.5714, 0.5681 and 0.5682.
Hence the error is about ±0.3 per cent.
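
The three values of $\lambda$ can be reproduced with exact rational
arithmetic. The sketch below assumes the one-parameter family
$y = \lambda x + (1 - \lambda)x^2$ and writes the discrepancy as
$h = A(x) - \lambda C(x)$ with $A = 2 + x^2$ and $C = 2 - x + x^2$;
this splitting and the helper names are ours.

```python
from fractions import Fraction
import math


def integrate_01(coeffs):
    """Integral over [0, 1] of a polynomial given by its coefficient list."""
    return sum(Fraction(c) / (i + 1) for i, c in enumerate(coeffs))


def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


def poly_eval(p, x):
    return sum(Fraction(c) * x ** i for i, c in enumerate(p))


A = [2, 0, 1]   # A(x) = 2 + x^2
C = [2, -1, 1]  # C(x) = 2 - x + x^2, so that h = A - lambda*C

half = Fraction(1, 2)
# Collocation: h(1/2) = 0.
lam_collocation = poly_eval(A, half) / poly_eval(C, half)
# Least squares: d/d(lambda) of the integral of h^2 vanishes.
lam_least_squares = integrate_01(poly_mul(A, C)) / integrate_01(poly_mul(C, C))
# Moments with psi = 1: the integral of h vanishes.
lam_moments = integrate_01(A) / integrate_01(C)

for lam in (lam_collocation, lam_least_squares, lam_moments):
    approx = lam * half + (1 - lam) * half ** 2
    print(lam, float(approx))
print(math.sin(0.5) / math.sin(1.0))  # exact solution at x = 0.5
```

The last line prints the exact value at $x = 0.5$ for comparison with
the three approximations printed before it.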

30. Simplification Method. This method is widely used in practical
calculations especially when we want to get a crude estimation
of the result. The method includes such techniques as simplification
of the original equation by dropping terms that are comparatively
small, replacing slowly varying coefficients by constants and the
like. After this procedure is carried out we can arrive at an equation
of one of the integrable types. Then integrating the equation we
obtain a function which can be regarded as an approximate solution
of the original equation. At any rate, such an approximation often
correctly describes the qualitative character of the behaviour of the
exact solution. After this “zero approximation” has been found we
can often use it for determining corrections which compensate for the
simplifications we have performed, and thus we can find a “first
approximation” and so on.

In case an equation involves parameters (e.g. masses, linear sizes 
of the objects under consideration etc.) we should take into account 
that the terms which we regard as being small can be different for 
different values of the parameters and therefore, generally, the 
simplification of an equation must be performed in different ways 
for different values of the parameters involved. Besides, it is
sometimes necessary to break the interval of variation of the independent
variable into several parts and simplify the equation on each of the 
parts by means of specific techniques pertaining to that very interval. 

It is especially useful to simplify an equation in the way described 
above if we use certain simplifications in the very process of dedu- 
cing the equation or if the degree of accuracy with which the quan- 
tities in question are determined is not sufficiently high. For in- 
stance, we should by all means drop such terms entering into an 
equation that are less than the admissible error of determination 
of its other terms. 

For example, let us consider the problem

$$
y'' + \frac{1}{1 + 0.1x}\,y + 0.2y^3 = 0, \quad y(0) = 1, \quad y'(0) = 0, \quad 0 \leq x \leq 2 \tag{187}
$$

The coefficient in $y$ changing slowly, we replace the coefficient by
its mean value (see Sec. XIV.5) which is equal to

$$
\frac{1}{2}\int_0^2 \frac{dx}{1 + 0.1x} = \frac{1}{2}\,\frac{\ln(1 + 0.1x)}{0.1}\bigg|_0^2 = \frac{\ln 1.2}{0.2} \approx 0.911
$$

Besides, let us drop the third summand which is comparatively
small.

Thus we get the equation $y'' + 0.911y = 0$ whose solution
satisfying the initial conditions is

$$
y = \cos 0.954x \tag{188}
$$

The form of this approximate solution justifies the procedure of
dropping the last term in the equation because the ratio of the third
term to the second term is of the order of $0.2y^2 \leq 0.2$ and
therefore the first term and the second term must approximately “cancel
out”. Let us now bring in a correction connected with the last
summand. To do this we substitute approximate solution (188) into
the summand and retain the averaged value of the coefficient in $y$:

$$
y'' + 0.911y = -0.2\cos^3 0.954x = -0.05\cos 2.862x - 0.15\cos 0.954x
$$

[here we have used formula (VIII.14)]. Now according to the
methods of Sec. 18 we obtain the solution

$$
y = 0.993\cos 0.954x - 0.079x\sin 0.954x + 0.007\cos 2.862x
$$

satisfying the initial conditions (check up the calculations!).
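
The numbers in this example are easy to recompute. The sketch below
evaluates the mean value of the coefficient and the amplitudes of the
correction terms. The variable names are ours, and the resonant term is
handled with the standard particular solution $Bx\sin\omega x$ of
$y'' + \omega^2 y = -0.15\cos\omega x$, for which $2\omega B = -0.15$.

```python
import math

# Mean value of 1/(1 + 0.1x) over the interval [0, 2].
mean_coeff = math.log(1.2) / 0.2
# Frequency of the zero approximation y = cos(omega * x).
omega = math.sqrt(mean_coeff)

# cos^3 t = (cos 3t + 3 cos t)/4, so -0.2 cos^3 t = -0.05 cos 3t - 0.15 cos t.
# Coefficient of cos(3*omega*x) in the particular solution:
amp_triple = -0.05 / (mean_coeff - (3 * omega) ** 2)
# Coefficient of x*sin(omega*x) produced by the resonant term:
amp_resonant = -0.15 / (2 * omega)
# Coefficient of cos(omega*x), chosen so that y(0) = 1:
amp_free = 1.0 - amp_triple

print(round(mean_coeff, 4), round(omega, 4))
print(round(amp_free, 3), round(amp_resonant, 3), round(amp_triple, 3))
```

No sine term of the homogeneous solution is needed, because each of
the three terms already has zero derivative at $x = 0$.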

The difference between the last result and zero approximation 
(188) is not large and therefore our conclusions concerning the roles 
of different summands entering into equation (187) are confirmed 
again. At the same time we see that the third term of equation (187) 
has introduced a certain correction into the solution. [Think how
we can determine the correction that takes into account the
variability of the coefficient in $y$ entering into equation (187).]

Considerations of the above type are often not sufficiently rigo- 
rous and may lead to mistakes. Therefore they should be applied in 
accordance with common sense. Then, comparatively often, they 
nevertheless result in approximate solutions that can be used for 
practical purposes. 

31. Euler’s Method. Now we proceed to study some methods of
numerical integration of differential equations. These methods are
particularly applicable to the cases when none of the above methods
of “approximate analytical integration”, that is methods of
constructing approximate formulas for solutions, turns out to be
effective, especially if a solution must be calculated with great
accuracy over a large interval of variation of the argument. Besides,
the methods are used when equations are solved by means of electronic
computers.

It is often advisable to combine methods of approximate and 
numerical integration. For instance, if an initial condition for the 
equation 


y+dte*yy= 


is given we can apply Taylor’s formula for small values of $x$ (see
Sec. 24), one of the methods of numerical integration for intermediate
values of $x$ and, finally, we can simply drop the term $e^{-x}$ for
large values of $x$.

We shall consider four methods of numerical solution of first-order
differential equations which are most frequently used. The
methods are easily extended to systems of first-order equations to
which we can also reduce higher-order equations. In courses on
approximate calculations one can find some other methods. We
especially recommend [3], [9], [44] and [34].

Euler’s method is visual and simple but not sufficiently effective. 
The reader must understand it well because many important and 
effective methods used in different branches of mathematics are 
essentially the development of the Euler method. 

Euler’s method is based on the procedure of direct replacement
of the derivative entering into a differential equation by the ratio
of finite differences (difference quotient) which were considered in
Sec. V.7. Suppose we have an initial-value problem of the form

$$
y' = f(x, y), \quad y(x_0) = y_0 \tag{189}
$$

For simplicity’s sake, let us take a constant step $h$ along the
$x$-axis. We introduce the notation

$$
x_0 + h = x_1, \quad x_0 + 2h = x_2, \quad x_0 + 3h = x_3, \ \dots
$$

Approximate values of $y(x_k)$ will be designated as $y_k$. To find the
values we replace the derivative in the equation by the difference
quotient. Hence we have

$$
\frac{\Delta y_k}{\Delta x} = f(x_k, y_k)
$$

i.e.

$$
\frac{y_{k+1} - y_k}{h} = f(x_k, y_k)
$$

and hence

$$
y_{k+1} = y_k + f(x_k, y_k)\,h \tag{190}
$$

Beginning with $y_0$ and making $k$ assume, in succession, the values
$k = 0, 1, 2, \dots$, we apply formula (190) and compute the values

$$
y_1 = y_0 + f(x_0, y_0)\,h, \quad y_2 = y_1 + f(x_1, y_1)\,h, \ \dots
$$


Euler’s method has a simple geometric meaning which is illustrated
in Fig. 301 where the integral curves are also depicted. We see
that the geometric significance of the method lies in the fact that
we draw the line segment $M_0M_1$ tangent to the desired integral
curve through the point $M_0$ instead of the integral curve itself which
is unknown. In doing so we follow the direction of the direction
field at the point $M_0$. We likewise draw the corresponding line
segment through the point $M_1$ according to the direction of the
tangent prescribed by the direction field at $M_1$ and so on. Thus we
obtain Euler’s broken line (polygonal line) which approximately
represents the integral curve that would appear if the step $h$ were
infinitesimal, that is if we continuously “corrected” the direction
of the broken line.

Fig. 301    Fig. 302

The order of the error of Euler’s method can be easily estimated.
Indeed, taking advantage of formula (190), we replace the increment
of the solution by its differential $y_k'\,\Delta x = f(x_k, y_k)\,h$.
This leads to an error of the order of $h^2$ [see formula (IV.49)]. When
we construct the solution on an interval $(x_0, x)$ and break the
interval into $n$ parts we have $h = \dfrac{x - x_0}{n}$ and therefore
the resultant error will be of the order of
$nh^2 = \dfrac{(x - x_0)^2}{n}$. Hence, to decrease the error 10 times,
that is to determine one more decimal digit, we must also increase the
number of points of division 10 times which leads to a considerable
increase of the amount of calculations. Here lies
the disadvantage of the method.
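
Formula (190) translates directly into a short program. The following
Python sketch is one possible implementation (the function name
`euler` and the test problem $y' = y$, $y(0) = 1$ are chosen here for
illustration); it also lets one observe how the error at $x = 1$
changes when the number of points of division is increased 10 times.

```python
import math


def euler(f, x0, y0, h, steps):
    """Apply formula (190) `steps` times and return the list of y_k."""
    x, y = x0, y0
    values = [y0]
    for _ in range(steps):
        y = y + f(x, y) * h
        x = x + h
        values.append(y)
    return values


def f(x, y):
    return y  # test problem y' = y, y(0) = 1, exact solution e^x


for n in (10, 100, 1000):
    approx = euler(f, 0.0, 1.0, 1.0 / n, n)[-1]
    print(n, approx, abs(approx - math.e))
```

For this linear test problem each step multiplies $y$ by $1 + h$, so
the value computed with $n$ steps is $(1 + 1/n)^n$.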

There is a specific feature of Euler’s method which is also
characteristic of other methods of numerical integration of
differential equations. We have already noted (see Sec. 7) that a
solution of a differential equation may approach infinity for a finite
value of $x$ when it is continued along the $x$-axis. But at the same
time it is clear that an approximate solution constructed in accordance
with Euler’s method remains finite for all values of $x$. To describe
the behaviour of the solution in such a case correctly we can apply the
following technique: if we see that there is a considerable increase
of the solution in its absolute value we can perform the substitution
$y = \dfrac{1}{z}$ in the differential equation. Then if further
integration shows that $z$ passes through a zero value attained at a
certain point $x = a$, this means that $|y(a)| = \infty$.
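
A simple illustration (our own example, not taken from the text): for
the problem $y' = y^2$, $y(0) = 1$ the exact solution $y = 1/(1 - x)$
becomes infinite at $x = 1$, and the substitution turns the equation
into one whose solution simply passes through zero there.

```latex
y = \frac{1}{z}: \qquad y' = -\frac{z'}{z^2} = \frac{1}{z^2}
\quad\Longrightarrow\quad z' = -1, \quad z(0) = 1,
\\
z = 1 - x, \qquad z(1) = 0 \quad\Longrightarrow\quad |y(1)| = \infty
```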

32. Runge-Kutta Method. We now demonstrate a simpler variant
of this method which refines Euler’s method. Suppose that
an approximate value $y_k$ of a solution for $x = x_k$ has already been
computed. Then we can find $y_{k+1}$ by calculating in accordance with
the formula

$$
f_k = f(x_k, y_k), \quad \alpha_k = f\left(x_k + \frac{h}{2},\ y_k + \frac{f_k h}{2}\right), \quad y_{k+1} = y_k + \alpha_k h \tag{191}
$$

The geometric meaning of these calculations is illustrated in
Fig. 302. Namely, every subsequent segment $M_kM_{k+1}$ of the broken
line approximating the integral curve is constructed in the following
way. We first draw a line segment from the point $M_k$ according to the
slope $f_k$ of the direction field at the point $M_k$, just as in
Euler’s method. But here we do not limit ourselves to the result thus
obtained and proceed to determine the slope $\alpha_k$ of the field at
the midpoint $N_k$ of the segment and to draw the new segment
$M_kM_{k+1}$ in this direction. Thus, in this way we refine the slopes
of the segments of the broken line approximating the integral curve.

Even the geometric illustration shows that this method is more
accurate than Euler’s method because here we take into account
the change of the slope of the field along the interval
$x_k \leq x \leq x_{k+1}$.

This can also be confirmed by calculations. By Taylor’s formula
(see Sec. XII.6) we get

$$
\alpha_k = f(x_k, y_k) + f'_x(x_k, y_k)\frac{h}{2} + f'_y(x_k, y_k)\frac{f_k h}{2} + O(h^2)
$$

where $O(h^2)$ designates a quantity which is bounded in comparison
with $h^2$ (compare with Sec. III.11). As in Sec. 24, we find

$$
\begin{gathered}
y_k' = f(x_k, y_k) = f_k, \quad y_k'' = f'_x(x_k, y_k) + f'_y(x_k, y_k)\,y_k', \\
y_{k+1} = y_k + \alpha_k h = y_k + f(x_k, y_k)\,h + f'_x(x_k, y_k)\frac{h^2}{2} + f'_y(x_k, y_k)\,f_k\frac{h^2}{2} + O(h^3) = y_k + y_k'h + \frac{y_k''}{2}h^2 + O(h^3)
\end{gathered} \tag{192}
$$

But the exact value of the solution satisfying the condition
$y(x_k) = y_k$ is equal to

$$
y(x_k + h) = y_k + y_k'h + \frac{y_k''}{2}h^2 + O(h^3) \tag{193}
$$

[see Taylor’s formula (IV.50)]. Comparing formulas (192) and (193)
we see that the values $y(x_k + h)$ and $y_{k+1}$ can differ only in
terms whose order of smallness is not less than that of $h^3$. From
this, as at the end of Sec. 31, we conclude that the resultant error is
of the order of $\dfrac{1}{n^2}$ or, which is the same, of the order of
$h^2$. Hence, if we increase the number of points of division 10 times
the degree of accuracy will increase 100 times.

A still more precise result will be obtained if we calculate
according to the following scheme:

$$
\begin{gathered}
f_k = f(x_k, y_k), \quad \alpha_k = f\left(x_k + \frac{h}{2},\ y_k + \frac{f_k h}{2}\right), \\
\beta_k = f\left(x_k + \frac{h}{2},\ y_k + \frac{\alpha_k h}{2}\right), \quad \gamma_k = f(x_k + h,\ y_k + \beta_k h), \\
y_{k+1} = y_k + \frac{1}{6}(f_k + 2\alpha_k + 2\beta_k + \gamma_k)\,h
\end{gathered}
$$

Calculations similar to (192) show that here the error made at every
step does not exceed a quantity of the order of $h^5$ and therefore the
resultant error is of the order of $h^4$. Consequently, if we increase
the number of points of division 10 times the degree of accuracy
will increase 10,000 times.
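
Both schemes of this section can be sketched in a few lines of Python.
The function names `midpoint_step`, `rk4_step` and `integrate`, and the
test problem $y' = y$, $y(0) = 1$, are assumptions of this example; the
variables `alpha`, `beta`, `gamma` follow the notation of the formulas
above.

```python
import math


def midpoint_step(f, x, y, h):
    """One step of formula (191)."""
    fk = f(x, y)
    alpha = f(x + h / 2, y + fk * h / 2)
    return y + alpha * h


def rk4_step(f, x, y, h):
    """One step of the more precise four-slope scheme."""
    fk = f(x, y)
    alpha = f(x + h / 2, y + fk * h / 2)
    beta = f(x + h / 2, y + alpha * h / 2)
    gamma = f(x + h, y + beta * h)
    return y + (fk + 2 * alpha + 2 * beta + gamma) * h / 6


def integrate(step, f, x0, y0, h, steps):
    """Repeat the given one-step rule and return the final value of y."""
    x, y = x0, y0
    for _ in range(steps):
        y = step(f, x, y, h)
        x += h
    return y


def f(x, y):
    return y  # exact solution e^x


for n in (10, 100):
    h = 1.0 / n
    err_mid = abs(integrate(midpoint_step, f, 0.0, 1.0, h, n) - math.e)
    err_rk4 = abs(integrate(rk4_step, f, 0.0, 1.0, h, n) - math.e)
    print(n, err_mid, err_rk4)
```

Comparing the two printed rows shows how each error responds to a
tenfold increase of the number of points of division.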

33. Adams Method. This method was introduced in 1883 by the
English astronomer J. C. Adams (1819-1892). The method is based
on Newton’s second interpolation formula (V.29) which is applied
to the derivative $y'(x)$ of the solution beginning with a certain
value $x_k = x_0 + kh$ of the argument:

$$
y'(x) = y_k' + \frac{\Delta y_{k-1}'}{1!}\,\frac{x - x_k}{h} + \frac{\Delta^2 y_{k-2}'}{2!}\,\frac{x - x_k}{h}\left(\frac{x - x_k}{h} + 1\right) + \frac{\Delta^3 y_{k-3}'}{3!}\,\frac{x - x_k}{h}\left(\frac{x - x_k}{h} + 1\right)\left(\frac{x - x_k}{h} + 2\right) \tag{194}
$$

In applying the formula we have substituted $x_k - x = -(x - x_k)$
for $t$ and, respectively, the values of $y'$ for the values of $y$.
Besides, we have replaced the sign of approximate equality by the sign
of exact equality although formula (194) is, of course, approximate,
and its error is of the order of $\Delta^4 y'$, that is of the order of
$h^4$ (see Sec. V.7). Integrating formula (194) from $x_k$ to
$x_{k+1} = x_k + h$ (with the help of the substitution
$\dfrac{x - x_k}{h} = s$) we obtain

$$
y_{k+1} = y_k + \left(y_k' + \frac{1}{2}\Delta y_{k-1}' + \frac{5}{12}\Delta^2 y_{k-2}' + \frac{3}{8}\Delta^3 y_{k-3}'\right)h \tag{195}
$$

(check up the calculations!).

The error of formula (195) is the result of the integration of the
error of formula (194) and therefore it is of the order of $h^5$ (why
is it so?).

Formula (195) is utilized in the following way. First we find
the values

$$y_1 = y(x_0 + h), \quad y_2 = y(x_0 + 2h) \quad\text{and}\quad y_3 = y(x_0 + 3h)$$

by means of some other method, for instance, by Taylor’s formula
(see Sec. 24) or by the Runge-Kutta method (see Sec. 32). Then we
compute the corresponding values

$$y'_0 = f(x_0, y_0), \quad y'_1 = f(x_1, y_1), \quad y'_2 = f(x_2, y_2) \quad\text{and}\quad y'_3 = f(x_3, y_3)$$

This enables us to determine

$$\Delta y'_0 = y'_1 - y'_0, \quad \Delta y'_1, \quad \Delta y'_2, \quad \Delta^2 y'_0 = \Delta y'_1 - \Delta y'_0, \quad \Delta^2 y'_1, \quad \Delta^3 y'_0 = \Delta^2 y'_1 - \Delta^2 y'_0$$

Further, putting $k = 3$ in formula (195) we calculate $y_4$, and then
using this value we find $y'_4 = f(x_4, y_4)$, $\Delta y'_3 = y'_4 - y'_3$, $\Delta^2 y'_2$ and
$\Delta^3 y'_1$. Then putting $k = 4$ in formula (195) we calculate $y_5$. After
that we use $y_5$ for finding $y'_5 = f(x_5, y_5)$ etc. The calculations are
performed according to the following scheme:


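$$\begin{array}{cccccc}
x & y & y' = f(x, y) & \Delta y' & \Delta^2 y' & \Delta^3 y' \\
\hline
x_{k-3} & y_{k-3} & y'_{k-3} & \Delta y'_{k-3} & \Delta^2 y'_{k-3} & \Delta^3 y'_{k-3} \\
x_{k-2} & y_{k-2} & y'_{k-2} & \Delta y'_{k-2} & \Delta^2 y'_{k-2} & \\
x_{k-1} & y_{k-1} & y'_{k-1} & \Delta y'_{k-1} & & \\
x_k & y_k & y'_k & & & \\
x_{k+1} & y_{k+1} & & & &
\end{array}$$

The entries $y'_k$, $\Delta y'_{k-1}$, $\Delta^2 y'_{k-2}$ and $\Delta^3 y'_{k-3}$ entering into
formula (195) lie on an ascending diagonal of the table.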

34. Milne’s Method. This method can be obtained by means of 
Newton’s first interpolation formula (V.27). It is one of the most 
effective methods. Here we give only the final result, that is the 
scheme for performing calculations implied by the method (which 
was introduced in 1926). 

The calculations are performed according to the following formulas:

$$\bar{y}_{k+1} = y_{k-3} + \frac{4h}{3}\,(2y'_{k-2} - y'_{k-1} + 2y'_k),$$

$$\bar{y}'_{k+1} = f(x_{k+1}, \bar{y}_{k+1}), \qquad (k = 3, 4, 5, \ldots) \qquad (196)$$

$$y_{k+1} = y_{k-1} + \frac{h}{3}\,(y'_{k-1} + 4y'_k + \bar{y}'_{k+1})$$

where $y'_i = f(x_i, y_i)$. Here, as in the Adams method, the values
$y_0$, $y_1$, $y_2$, $y_3$ should be found in advance by means of some other
method. After the values have been found we put $k = 3$ in formula
(196) and calculate $\bar{y}_4$, $\bar{y}'_4$, $y_4$ in succession. Then, putting $k = 4$
we find $\bar{y}_5$, $\bar{y}'_5$, $y_5$ and so on. The values $y_4$, $y_5$, $y_6$, ... thus
determined are the approximate values of the solution $y(x)$ for $x = x_4$,
$x_5$, $x_6$, ... where $x_i = x_0 + ih$.


It turns out that the absolute error which occurs when we calculate
$y_{k+1}$ according to this method is approximately equal to

$$\frac{|\bar{y}_{k+1} - y_{k+1}|}{29}$$

Therefore when performing the calculations we can simultaneously
check whether the error lies within the limits of the degree of accuracy
we have chosen for our calculations. If we see that at a certain
stage of our calculations the error falls outside the prescribed limits
we must decrease (from the corresponding value of $x$ onwards) the
step $h$ taking into account that the resultant error of the method
is of the order of $h^4$.
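The scheme (196) together with the error check can be sketched as follows. The names `rk4_step` and `milne`, the choice of the Runge-Kutta scheme for the starting values and the test problem $y' = y$, $y(0) = 1$ are assumptions introduced for this illustration only.

```python
import math

def rk4_step(f, x, y, h):
    """One Runge-Kutta step, used here only to obtain y_1, y_2, y_3."""
    a = f(x, y)
    b = f(x + h / 2, y + a * h / 2)
    c = f(x + h / 2, y + b * h / 2)
    d = f(x + h, y + c * h)
    return y + (a + 2 * b + 2 * c + d) * h / 6

def milne(f, x0, y0, h, n):
    """Milne's scheme (196); returns nodes, values and error estimates."""
    xs = [x0 + i * h for i in range(n + 1)]
    ys = [y0]
    for i in range(3):
        ys.append(rk4_step(f, xs[i], ys[i], h))
    dy = [f(xs[i], ys[i]) for i in range(4)]
    estimates = []
    for k in range(3, n):
        # preliminary value (first formula of (196))
        y_bar = ys[k - 3] + 4 * h / 3 * (2 * dy[k - 2] - dy[k - 1] + 2 * dy[k])
        dy_bar = f(xs[k + 1], y_bar)
        # final value (third formula of (196))
        y_new = ys[k - 1] + h / 3 * (dy[k - 1] + 4 * dy[k] + dy_bar)
        estimates.append(abs(y_bar - y_new) / 29)   # approximate absolute error
        ys.append(y_new)
        dy.append(f(xs[k + 1], y_new))
    return xs, ys, estimates

# Hypothetical test problem: y' = y, y(0) = 1, exact value y(1) = e.
xs, ys, estimates = milne(lambda x, y: y, 0.0, 1.0, 0.05, 20)
milne_error = abs(ys[-1] - math.e)
print(milne_error, max(estimates))
```

In a practical program the list `estimates` would be compared with the prescribed accuracy at every step, and the step $h$ would be decreased when the estimate becomes too large.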


CHAPTER XVI 


Multiple Integrals 


§ 1. Definition and Basic Properties of Multiple Integrals 


1. Some Examples Leading to the Notion of a Multiple Integral.
We now consider a solid (Q) with the density of mass distribution $\rho$.
The density $\rho$ can be variable, that is different at different points
of the solid. Let the function $\rho = \rho(M)$ (where M is a point of (Q))
be known and let it be necessary to determine the whole mass m
of the solid. An analogous problem for the case of a linear mass
distribution was solved in Secs. XIV.1, 2. Let the reader read these
sections again before proceeding to study multiple integrals.

The spatial case is treated quite similarly. Let us mentally divide the
region (Q) into n parts (subregions) $(\Delta Q_1), (\Delta Q_2), \ldots, (\Delta Q_n)$
as shown in Fig. 303. Let the symbols $(\Delta Q_k)$ ($k = 1, \ldots, n$)
designate the parts themselves, and let $\Delta Q_k$ designate
their volumes. Now, choose an arbitrary point $M_k$ ($k = 1, \ldots, n$)
in each of the subregions $(\Delta Q_k)$. The points $M_1, M_2, \ldots, M_n$ are
also shown in Fig. 303 which represents a spatial picture. If the
parts $(\Delta Q_k)$ are sufficiently small we can regard the density as
being constant within each of the parts without an essential error.
Then the mass $m_{(\Delta Q_1)}$ of the first part $(\Delta Q_1)$ can be computed as
the product of the density by the volume, i.e. as $\rho(M_1)\,\Delta Q_1$.
The mass $m_{(\Delta Q_2)}$ of the second part is found similarly and so on.

Thus, we obtain

$$m_{(Q)} \approx \rho(M_1)\,\Delta Q_1 + \rho(M_2)\,\Delta Q_2 + \ldots + \rho(M_n)\,\Delta Q_n = \sum_{k=1}^{n} \rho(M_k)\,\Delta Q_k$$

This is an approximate equality since the densities of the parts
are nevertheless variable. But the smaller the parts, the greater
the accuracy. Hence, passing to the limit, as $\Delta Q_k \to 0$ ($k = 1, \ldots, n$),
we obtain the exact equality

$$m_{(Q)} = \lim \sum_{k=1}^{n} \rho(M_k)\,\Delta Q_k \qquad (1)$$


The limit is taken here in a process in which not only the volumes 
but also all the linear sizes of the parts of the partitions tend to 
zero. Besides, it is supposed that the limit does not depend on the 
way of partitioning (Q) into subregions. 

Reasoning in a similar way we can conclude that if an electric
charge is distributed over a solid (Q) with density $\sigma$ the magnitude
q of the charge is found by means of the formula

$$q = \lim \sum_{k=1}^{n} \sigma(M_k)\,\Delta Q_k \qquad (2)$$


where the notation is understood as before. 

A mass or a charge can be distributed not only in a volume but
also over a surface or a curve. When we say that a mass or a charge
is distributed over a surface this, of course, means that one of the
dimensions of the domain of space in which the quantity is distributed
is very small relative to the other two dimensions. The distribution
along a curve is understood similarly. Formulas (1) and (2)
remain true in the latter cases if the density $\rho$ (or $\sigma$) is understood
as a surface (areal) density (i.e. mass or charge per unit area) or as
a linear density (i.e. related to unit length) and $\Delta Q_k$ designates
the area or the length of the part $(\Delta Q_k)$ ($k = 1, \ldots, n$), respectively.
In the general case we call $\Delta Q_k$ the measure of the subregion
$(\Delta Q_k)$ understanding it as volume, area or length depending on
whether we consider spatial regions, surfaces or curves.

2. Definition of a Multiple Integral. The similarity between for- 
mulas (1) and (2) indicates the advisability of the general definition 
of a multiple integral given below. For definiteness, let us consider 
integrals over three-dimensional regions. The measure of such 
a region is understood as its volume. 

Suppose we are given a bounded (finite) region (Q) in space. Let
a function u = f (M) be defined over (Q) and let the value f (M)
of the function be finite at each point M of the region. To compose
an integral sum we arbitrarily break up the region (Q) into subregions
$(\Delta Q_1), (\Delta Q_2), \ldots, (\Delta Q_n)$ and take an arbitrary point
$M_k$ ($k = 1, \ldots, n$) in each of them. Then taking the values $f(M_1)$,
$f(M_2), \ldots, f(M_n)$ of the function f assumed at the points $M_1$,
$M_2, \ldots, M_n$ we write down the integral sum

$$\sum_{k=1}^{n} u_k\,\Delta Q_k = \sum_{k=1}^{n} f(M_k)\,\Delta Q_k \qquad (3)$$

where $\Delta Q_k$ ($k = 1, \ldots, n$) designates, as before, the volume of
the subregion $(\Delta Q_k)$.

The limit of the integral sum taken in a process in which all the
linear sizes of the subregions entering into the partitions of the
region (Q) are unlimitedly decreased is called the integral of the
function f over the region (Q). Denoting the integral by the symbol
$\int_{(Q)} u\,dQ$ we can write

$$\int_{(Q)} u\,dQ = \int_{(Q)} f(M)\,dQ = \lim \sum_{k=1}^{n} f(M_k)\,\Delta Q_k \qquad (4)$$

(compare this with the basic definitions given in Secs. XIV.2 and
XIV.22). (Q) is called the region (domain) of integration.
Consequently, formulas (1) and (2) can be rewritten as

$$m = \int_{(Q)} \rho\,dQ \quad\text{and}\quad q = \int_{(Q)} \sigma\,dQ$$
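The passage from the integral sum to the integral can be observed numerically. The following sketch is only an illustration: the region (the unit cube), its partition into equal small cubes with the points $M_k$ taken at their centres, and the density $\rho = 1 + x^2 + y^2 + z^2$ (for which the exact mass equals 2) are all assumptions made here.

```python
def mass_by_integral_sum(rho, n):
    """Integral sum (3) for the unit cube divided into n**3 equal cubes."""
    h = 1.0 / n          # linear size of each part
    dQ = h ** 3          # volume of each part
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # the point M_k is taken at the centre of the part
                x, y, z = (i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h
                total += rho(x, y, z) * dQ
    return total

def rho(x, y, z):
    # hypothetical density; the exact mass of the unit cube is 2
    return 1.0 + x * x + y * y + z * z

m10 = mass_by_integral_sum(rho, 10)
m20 = mass_by_integral_sum(rho, 20)
print(m10, m20)
```

As the linear sizes of the parts decrease, the sums approach the exact value of the integral, in agreement with (1) and (4).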

As in Sec. XIV.2, we can integrate both continuous and discon- 
tinuous functions. The existence of limit (4) can be proved for any 
finite function (under some additional conditions) defined in a finite 
region without referring to the physical meaning of an integral 
(by means of purely mathematical considerations). Besides, it is 
not the boundedness of the region that is essential for such a proof 
but the boundedness of its measure. Fig. 264 represents an example 
of a region which extends to infinity but has a finite measure. 

The definition of an integral taken over a surface (which can be 
plane or curvilinear) or along a curve is formulated quite similarly. 
In these definitions we must, of course, take the areas or the lengths 
of the subregions instead of volumes when forming a partition of 
the region. In particular, an integral of this type along a curve is 
nothing but a line integral of the first type taken with respect to 
arc length which was studied in Sec. XIV.22. Integrals over a volume 
and over a surface are referred to as multiple integrals. The former 
are also called triple integrals and the latter are called double in- 
tegrals. These terms will be explained in § 3. 

3. Basic Properties of Multiple Integrals. The basic properties 
of a definite integral proved in Secs. XIV.4, 5 are implied by the 
definition of an integral as the limit of the integral sum (see 
Sec. XIV.2). Therefore we can easily extend these properties to multiple 
integrals. We enumerate them here. 
1. The integral of a sum equals the sum of the integrals of the
summands (the same is true for the difference):

$$\int_{(Q)} (u_1 + u_2)\,dQ = \int_{(Q)} u_1\,dQ + \int_{(Q)} u_2\,dQ$$

2. A constant factor can be taken outside the sign of integration:

$$\int_{(Q)} Cu\,dQ = C\int_{(Q)} u\,dQ \qquad (C = \text{const})$$

3. The theorem on a partition of the region of integration: for
any partition of the region (Q) into parts the integral over the whole
region is equal to the sum of the integrals over the parts. For definiteness,
if (Q) is divided into the parts $(Q_1)$ and $(Q_2)$ we have

$$\int_{(Q)} u\,dQ = \int_{(Q_1)} u\,dQ + \int_{(Q_2)} u\,dQ$$

4. The integral of unity is equal to the measure of the region of
integration:

$$\int_{(Q)} dQ = Q$$

5. If the region of integration degenerates, that is its measure
turns into zero, the integral itself becomes equal to zero.

When formulating properties 4 and 5 we speak about the measure 
of a region (see Sec. 1) understanding it as volume, area or length 
depending on whether we consider triple, double or line integrals. 

6. If the variables in question have certain dimensions then

$$\left[\int_{(Q)} u\,dQ\right] = [u]\cdot[Q]$$

7. The case of symmetry. If the domain of integration can be divided
into two symmetric parts and if the integrand takes equal values
at the corresponding points belonging to these parts, the integral
over the whole domain is equal to the doubled integral over each
of these parts. If the integrand is multiplied by −1 when we pass
from any point belonging to one of the parts to the symmetric point
in the other part the integral over the whole region is equal to zero.

It is sometimes possible to break the domain of integration into
a greater number of equal parts in order to reduce a given integral
to an integral over a domain of a simpler form than the original
domain.
8. It is allowable to integrate inequalities: if $u_1 \le u_2$ then

$$\int_{(Q)} u_1\,dQ \le \int_{(Q)} u_2\,dQ \qquad (5)$$

The last inequality turns into the strict equality if and only if
$u_1 \equiv u_2$ provided both functions $u_1$ and $u_2$ are continuous. But
for the case of discontinuous functions integrals (5) can nevertheless
coincide even when the identity $u_1 \equiv u_2$ is violated at points belonging
to degenerated subregions which have a zero measure because
such a violation does not affect the value of the integral (compare
this with the corresponding property of an integral over a line
segment).

9. An integral satisfies the inequalities

$$u_{\min}\,Q \le \int_{(Q)} u\,dQ \le u_{\max}\,Q \qquad (6)$$

10. Inequalities (6) are connected with the notion of the mean
value $\bar{u}$ of a function u over a region (Q) which is defined by means
of the formula

$$\int_{(Q)} \bar{u}\,dQ = \int_{(Q)} u\,dQ \qquad (\bar{u} = \text{const})$$

similar to that given in Sec. XIV.5. Thus, we have

$$\bar{u} = \frac{1}{Q}\int_{(Q)} u\,dQ \quad\text{and}\quad \int_{(Q)} u\,dQ = \bar{u}\,Q$$

Inequalities (6) imply that $u_{\min} \le \bar{u} \le u_{\max}$.

All these properties can be visually illustrated if we regard the
function u as the density of a mass distribution and the integral as
the mass itself.

11. There is an inequality of the form

$$\left|\int_{(Q)} u\,dQ\right| \le \int_{(Q)} |u|\,dQ$$

which is similar to that given at the end of Sec. XIV.5.
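Some of the properties enumerated above can be checked on a simple numerical example. The sketch below is an illustration under assumptions made here: the region is the rectangle $0 \le x \le 2$, $0 \le y \le 1$ (so that its measure is 2), the functions $u = xy + 1$ and $v = x - 1$ are chosen arbitrarily, and the integrals are replaced by integral sums with the points taken at the centres of the parts.

```python
def integral_sum(u, a, b, c, d, n=100):
    """Integral sum over the rectangle a <= x <= b, c <= y <= d (n*n equal parts)."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        for k in range(n):
            total += u(a + (i + 0.5) * hx, c + (k + 0.5) * hy)
    return total * hx * hy

a, b, c, d = 0.0, 2.0, 0.0, 1.0
Q = (b - a) * (d - c)                      # measure of the region

def u(x, y):
    return x * y + 1.0                     # hypothetical integrand

u_min, u_max = 1.0, 3.0                    # least and greatest values of u on the rectangle
I = integral_sum(u, a, b, c, d)
u_mean = I / Q                             # mean value, property 10

def v(x, y):
    return x - 1.0                         # hypothetical sign-changing integrand

lhs = abs(integral_sum(v, a, b, c, d))                      # |integral of v|
rhs = integral_sum(lambda x, y: abs(v(x, y)), a, b, c, d)   # integral of |v|

print(I, u_mean, lhs, rhs)
```

For these particular functions the integral sums coincide with the exact integrals up to rounding, because the integrands are linear in each variable within every part of the partition.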

4. Methods of Applying Multiple Integrals. There are two basic 
schemes of applying multiple integrals to physical problems (compare 
with Sec. XIV.6). The first one is based on representing the quantity 
in question in an approximate form of integral sum (3) in which 
we then pass to the limit, as it was shown in Sec. 1. The second 
scheme is based on composing the “element” (differential) of the 
quantity. We now briefly discuss the latter scheme (we shall dwell 
in more detail on the scheme in § 2). 
Suppose we are interested in a quantity q which corresponds, for
definiteness, to a spatial domain (Q) [the situation is similar to
that considered in Sec. 1 where we had a mass m or a charge q distributed
over a three-dimensional region (Q)]. Let us form the expression
$dq = \rho(M)\,dQ$ which approximately describes an infinitesimal
portion of the quantity q corresponding to an infinitesimal volume
dQ placed at an arbitrary point M. This expression possesses the
following properties: (1) it is directly proportional to the volume
and (2) it differs from the true value $\Delta q$ of the quantity q corresponding
to the portion dQ by an infinitesimal term of higher order of
smallness relative to $\Delta q$. Now, summing all the quantities dq over
all the “elements of volume” dQ within the domain (Q) we obtain

$$q = q_{(Q)} = \int_{(Q)} \rho(M)\,dQ \qquad (7)$$

As an example of the first method, let us consider the expression
of the static moment (moment of mass) of a material body with respect
to a plane (P). As is known from mechanics, the static
moment of a finite system of material points relative to a plane (P)
is expressed by the formula

$$S_{(P)} = \sum_{k=1}^{n} m_k z_k$$

where $m_k$ is the mass of the kth point ($k = 1, 2, \ldots, n$) and $z_k$
is the coordinate of $m_k$ reckoned along an axis drawn perpendicularly
to the plane (P) (see Fig. 304).

Fig. 304

If the mass is distributed over a spatial region (Q) we divide the
region into n parts $(\Delta Q_1), (\Delta Q_2), \ldots, (\Delta Q_n)$ and consider an
approximate model in which the mass of each of the parts is concentrated
at one of its points. This yields the approximate expression

$$S_{(P)} \approx \sum_{k=1}^{n} \rho_k z_k\,\Delta Q_k$$

which can be written in more detail as

$$S_{(P)} \approx \sum_{k=1}^{n} \rho(M_k)\,z(M_k)\,\Delta Q_k$$

Passing to the limit we thus obtain

$$S_{(P)} = \int_{(Q)} \rho z\,dQ \qquad (8)$$

The above calculations correspond to the first method mentioned
at the beginning of this section.
If we wanted to use the second method we should write the expression
of the element of moment of mass:

$$dS_{(P)} = \rho z\,dQ$$

Summing, we should deduce the same formula (8).
Knowing the static moment we can readily find the coordinate $z_c$
of the centre of gravity of the solid in question:

$$z_c = \frac{S_{(P)}}{m} = \frac{\int_{(Q)} \rho z\,dQ}{\int_{(Q)} \rho\,dQ}$$

The expression is simplified in the case when the solid is homogeneous,
that is when $\rho$ = const. Then the centre of gravity is referred
to as the geometric centre of gravity, and we have

$$z_c = \frac{1}{Q}\int_{(Q)} z\,dQ \qquad (9)$$

The other coordinates of the centre of gravity of the body (Q) are
found similarly. The static moments and the coordinates of the
centre of gravity of a plane geometric figure with respect to a straight
line lying in the plane are found in like manner.
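The formulas for the static moment and for the centre of gravity can be tried out numerically. The sketch below rests on assumptions made only for illustration: the body is the unit cube, the plane (P) is the plane $z = 0$, the density is $\rho = 1 + z$, and the integrals are replaced by integral sums over equal small cubes. For this density the exact values are $m = 3/2$, $S_{(P)} = 5/6$ and $z_c = 5/9$.

```python
def centre_of_gravity_z(rho, n=20):
    """Approximate mass, static moment about z = 0 and z_c for the unit cube."""
    h = 1.0 / n
    dQ = h ** 3
    m = 0.0      # mass, integral of rho dQ
    S = 0.0      # static moment, integral of rho * z dQ
    for i in range(n):
        for j in range(n):
            for k in range(n):
                x, y, z = (i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h
                dm = rho(x, y, z) * dQ    # mass of one part
                m += dm
                S += dm * z               # its contribution m_k * z_k
    return S / m, m, S

# hypothetical density rho = 1 + z
z_c, m, S = centre_of_gravity_z(lambda x, y, z: 1.0 + z)
print(z_c, m, S)
```

The centre of gravity lies above the middle of the cube since the density increases with z.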

5. Geometric Meaning of an Integral Over a Plane Region. Such
an integral, unlike other integrals defined in Sec. 2, can be directly
interpreted geometrically. Its geometric meaning is similar to that
of an ordinary definite integral considered in Sec. XIV.2. Let us be
given an integral of the form

$$I = \int_{(Q)} u\,dQ \qquad (10)$$

where (Q) is a domain lying in a plane (P) (see Fig. 305). Let us draw the
u-axis perpendicularly to the plane and construct a line segment of
length u (M) parallel to the u-axis and passing through
a point M belonging to the domain (Q). For simplicity’s sake we
now consider positive values of u; then the segment is drawn in the
positive direction of the u-axis and the end-point N of the segment
lies above the plane (P). If u < 0 the segment is drawn in the negative
direction of the u-axis. When the point M runs throughout
the domain (Q) the corresponding point N describes a surface (S)
which is the graph of the integrand (such a graph was constructed
in Fig. 190). The surface (S) together with the plane figure (Q) and
the cylindrical surface formed by the line segments parallel to the
u-axis and drawn through each point of the contour bordering the
domain (Q) bound a cylindrical body.

Fig. 305

The geometric meaning of integral (10) lies in the fact that it is
equal to the volume of the cylindrical body. Indeed, the element of
volume corresponding to a surface element dQ of the domain (Q)
(see Fig. 305) containing a point M can be regarded as a right cylinder
with base dQ and height u (M) to within infinitesimals of higher
order. Hence, this volume is approximately equal to dV = u dQ.
Summing up these elements of volume we arrive at the formula

$$V = \int_{(Q)} u\,dQ = I$$

which is what we set out to prove.

If u assumes negative values as well the volumes of the parts of
the body lying under the plane (P) enter into the result with the
sign minus (this situation is quite analogous to the one considered
in Sec. XIV.2).

By analogy with formula (XIV.28), we can pass from the volumes
of cylindrical bodies to the volume of a body having an arbitrary
form. If (Q) is the projection of the body on a plane (P) and
h = h (M) is the length of the line segment formed by the intersection
of the body with a straight line which is perpendicular to (P) and
passes through the current point M of the domain (Q), we have
the expression

$$V = \int_{(Q)} h\,dQ$$

for the volume of the body.


§ 2. Two Types of Physical Quantities


6. Basic Example. Mass and Its Density. We now consider a material
body whose density can be variable in the general case. Let us
disregard the molecular structure of the body and consider its mass
to be continuously distributed in space. According to this model
such a body has a certain density $\rho$ at each of its points M, and
hence $\rho$ is a function of a point of the form $\rho = \rho(M)$ (see Sec.
IX.9). In contrast to it, the mass cannot be regarded as a function
of a point because the mass of a separate point is equal to zero.
The mass is a quantity which is distributed in space which means
that to each region (Q) (regarded as being mentally taken out of
the space) there corresponds a certain value $m_{(Q)}$ of the mass. As
before, (Q) is regarded here as a symbol designating the region itself,
not its volume.
The connection between the mass and the density is as follows.
Let the value $m_{(Q)}$ of the mass corresponding to each domain (Q)
in space be known. Then the ratio

$$\rho_{av} = \frac{m_{(Q)}}{Q}$$

is referred to as the mean (average) density in (Q). The symbol Q
entering into the ratio designates, as in Sec. 1, the volume of the
domain (Q). Hence, Q is a quantity having the dimension of volume.
[By the way, we shall sometimes refer to (Q) as a volume when there
is no danger of misunderstanding.]

To obtain the density at a certain point M we must pass to the
limit (compare with Sec. IV.1) by making (Q) contract to the point M:

$$\rho(M) = \lim_{(Q)\to M} \rho_{av} = \lim_{(Q)\to M} \frac{m_{(Q)}}{Q} \qquad (11)$$

This process is analogous to that of calculating a derivative. When
we write (Q) → M we mean that (Q) is unlimitedly contracted to
the point M. If (Q) → M we have, naturally, Q → 0 but the requirement
that (Q) → M is stronger than the condition Q → 0 (why?).
Thus, the density of mass distribution at a point is the mass of
an infinitesimal volume related to unit volume.
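The limiting process which leads from the mean density to the density at a point can be followed on a model example. Everything specific in the sketch below is an assumption made for illustration: the density is $\rho = 1 + x^2$, the contracting regions are cubes of side s centred at a point with abscissa $x_0 = 0.5$, and the mass of such a cube is written down exactly by integrating $\rho$ with respect to x.

```python
def mass_of_cube(x0, s):
    """Exact mass of a cube of side s centred at abscissa x0 for rho = 1 + x**2."""
    lo, hi = x0 - s / 2, x0 + s / 2
    # cross-section area s*s times the integral of (1 + x**2) from lo to hi
    return s * s * ((hi - lo) + (hi ** 3 - lo ** 3) / 3)

def mean_density(x0, s):
    """Mean density: mass of the cube divided by its volume."""
    return mass_of_cube(x0, s) / s ** 3

x0 = 0.5
rho_at_point = 1.0 + x0 ** 2          # density at the point itself
sides = [0.4, 0.2, 0.1, 0.01]
deviations = [abs(mean_density(x0, s) - rho_at_point) for s in sides]
print(deviations)
```

For this density the mean density differs from $\rho(M)$ by $s^2/12$, so the deviations decrease as the cube contracts to the point, which agrees with the definition of the density as a limit.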
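Conversely, if the value $\rho(M)$ of the density is known at each
point M the mass $m_{(Q)}$ corresponding to any region (Q) can
be found on the basis of Secs. 1, 2 as the integral of the density
over (Q).

If we take into account the molecular structure of substance,
i.e. if we return from our idealized model to a real body, we
cannot even mentally contract the volume (Q) to a point unlimitedly.
In this case, instead of the limit formula (11), we must write

$$\rho(M) = \frac{m_{(\Delta Q)}}{\Delta Q}$$

where $(\Delta Q)$ is a volume containing the point M which is regarded
as being practically infinitesimal (see Sec. III.1). Consequently,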
the density of a real body at a point is the mean density in a volume 
which is sufficiently small relative to the dimensions of the body 
and at the same time sufficiently large relative to the molecular 
sizes. Hence, in these considerations we pass from the real discrete 
structure of a material body to its continuous model in which the 
density is obtained in the process of averaging based on the calcula- 
tion of the mean density corresponding to the volumes whose sizes 
were indicated above. 


Further, when considering a continuous medium in our course,
we shall always take the continuous model abstracting from the 
molecular structure of substance. 

7. Quantities Distributed in Space. There is a number of physical 
quantities which are analogous to mass in many respects. They 
possess the properties considered in the above example of mass 
distribution. Examples of such quantities are an electric charge 
in a dielectric, quantity of heat, energy of an electromagnetic field 
and the like. There is a feature which is common to all such quan- 
tities, namely, they are all distributed in space. In the general case 
we say that a quantity q is distributed in space if to each part (Q)
mentally isolated from the space there corresponds a certain value
$q_{(Q)}$ of the quantity. There is only one general requirement which
we introduce here: the quantity must be additive, i.e. for any partition
of (Q) into parts the value of the quantity corresponding to
(Q) must be equal to the sum of the values of the quantity corresponding
to the parts. Hence, if, for definiteness, (Q) is divided into
two parts $(Q_1)$ and $(Q_2)$ we must have $q_{(Q)} = q_{(Q_1)} + q_{(Q_2)}$.

A quantity distributed in space possesses a certain density at
each point. Thus we can speak about the density of an electric
charge, of field energy and so on. In the general case the density
$\rho$ is defined by analogy with formula (11) for the density of mass:

$$\rho(M) = \lim_{(Q)\to M} \frac{q_{(Q)}}{Q} \qquad (12)$$

The ratio under the sign of limit in (12) is the mean (average) density
of the quantity q in the volume (Q). The density $\rho = \rho(M)$ is then
a function of a point. The density of the quantity q at a point M
equals the value of q (corresponding to an infinitesimal region “placed
at the point M”) related to unit volume.

Conversely, if the density $\rho(M)$ of a quantity q is known the
quantity q itself is found by the methods described in Secs. 1, 2:

$$q_{(Q)} = \lim \sum_{k=1}^{n} \rho(M_k)\,\Delta Q_k = \int_{(Q)} \rho(M)\,dQ \qquad (13)$$

where the limit is taken in the process in which the linear sizes
of the parts forming partitions of the region are decreased unlimitedly.
In the general case q and $\rho$ can take on the values of any sign.
Let us rewrite formula (12) so as to stress that we deal with infinitesimal
volumes. To do this we substitute $(\Delta Q)$ for (Q):

$$\frac{q_{(\Delta Q)}}{\Delta Q} \to \rho(M) \ \text{ as } (\Delta Q) \to M, \quad\text{i.e.}\quad \frac{q_{(\Delta Q)}}{\Delta Q} = \rho(M) + \alpha$$

where $\alpha$ is infinitesimal when $(\Delta Q) \to M$. This implies

$$q_{(\Delta Q)} = \rho(M)\,\Delta Q + \alpha\,\Delta Q$$

Thus, the value of q corresponding to a small volume $(\Delta Q)$ is divided
into two parts the first of which is directly proportional to
the volume $\Delta Q$ while the other is of higher order of smallness. The
former summand is therefore called the differential or the element of
the quantity q (compare with Secs. IV.7, 8):

$$dq = \rho(M)\,\Delta Q \qquad (14)$$

This implies the physical meaning of dq: it is the value of q which
would correspond to the volume $(\Delta Q)$ if the density were constant
throughout $(\Delta Q)$ and were equal to the density at the point M. In
reality, $\Delta q$, i.e. $q_{(\Delta Q)}$, does not equal dq in the general case and differs
from it by an infinitesimal of higher order of smallness. Hence,
$\Delta q$ and dq are equivalent infinitesimals when $(\Delta Q) \to M$ (see Sec.
III.8). If it is possible to neglect such infinitesimals of higher order
we simply say that dq is the value of q corresponding to an infinitesimal
volume $(\Delta Q)$. We also call it an infinitesimal mass, an infinitesimal
charge etc. in such cases.

The volume can be regarded as a special case of a quantity q
distributed in space, that is we can put $q_{(Q)} = Q$. The corresponding
density is then equal to unity (as “a volume related to unit
volume”). Hence, formula (14) implies

$$dQ = \Delta Q$$

in this case. Therefore formula (14) can be put down in the form

$$dq = \rho(M)\,dQ \qquad (15)$$

which is preferable (this resembles the corresponding formula in
Sec. IV.9).

Thus, summing up, we can write the basic formulas connecting
a quantity $q = q_{(Q)}$ distributed in space and the corresponding
function of a point (density) $\rho = \rho(M)$ in the form

$$\rho(M) = \left.\frac{dq}{dQ}\right|_{M} \quad\text{and}\quad q_{(Q)} = \int_{(Q)} \rho(M)\,dQ$$

These formulas enable us to pass from one representation of a
quantity to the other in all cases. The density is found by means
of differentiating the quantity and the quantity itself is found by
integrating its density.

It should be noted that there are some quantities differing in
their nature from such quantities as masses and charges which can
also be considered to be distributed in space or over a surface or
a curve. For instance, the static moment or the moment of inertia
of a material body satisfies the conditions formulated in the definition
given at the beginning of Sec. 7 and thus they can be regarded
as being distributed in space although they depend on the choice
of a plane or of an axis. In practical problems we do not distinguish
between such quantities belonging to different classes and simply
write the expression dq according to the considerations given in
Sec. 4 and then perform the summation (or integration) of the elements
on the basis of the additivity law and thus arrive at an expression
of form (7).

A quantity can be distributed not only over a volume but also 
over a surface (plane or curvilinear) or along a curve. All the results 
obtained in this section remain valid for these cases if we interpret 
(Q) not as a three-dimensional domain mentally taken out of space 
but as a part of a surface (i.e. a region on a surface) or a part of a 
curve and understand Q as the area or length of the part, that is
as its measure (see Sec. 1). 


§ 3. Computing Multiple Integrals in Cartesian 
Coordinates 


8. Integral Over Rectangle. We now consider an integral

$$I = \int_{(Q)} u\,dQ \qquad (16)$$

where (Q) is a rectangle bounded by coordinate lines of a Cartesian
coordinate system arbitrarily chosen in a plane (see Fig. 306). The
rectangle is described by inequalities $a \le x \le b$ and $c \le y \le d$
where a, b, c, d are some constants. When forming an integral sum S for
integral (16) in the Cartesian coordinate system it is natural to break
up (Q) into parts by means of straight lines parallel to the coordinate
axes which divide the interval $a \le x \le b$ into parts $\Delta x_i$ and the
interval $c \le y \le d$ into parts $\Delta y_k$.
Let us denote by $u_{ik}$ the value of the integrand u = u (x, y) at a
point belonging to the subregion adjoining the intersection of the ith
vertical line with the kth horizontal line (see Fig. 306). (Let the
reader pay attention to the fact that the numeration of $u_{ik}$ does
not coincide with the one used in the theory of matrices in Sec. XI
where $a_{ik}$ designates the element of a matrix lying at the intersection
of the ith row of the matrix with its kth column.)
We then approximately have

$$I \approx S = \sum_{i,k} u_{ik}\,\Delta x_i\,\Delta y_k \qquad (17)$$

where the summation is extended over all the subregions (small
rectangles), i.e. over all the values of i and k (for instance, $i = 1, 2, \ldots, m$
and $k = 1, 2, \ldots, n$).

A sum of form (17) with two summation indices is a two-dimensional
integral sum. To compute it we must first perform summation
with respect to k for a fixed i, that is to sum up the summands
forming a column (the ith column) of the table

$$\begin{array}{ccccc}
u_{11} & u_{21} & u_{31} & \ldots & u_{m1} \\
u_{12} & u_{22} & u_{32} & \ldots & u_{m2} \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
u_{1n} & u_{2n} & u_{3n} & \ldots & u_{mn}
\end{array}$$

for each fixed value of i ($i = 1, 2, \ldots, m$) and then perform the
summation with respect to i. This results in

$$S = \sum_{i=1}^{m}\left(\sum_{k=1}^{n} u_{ik}\,\Delta x_i\,\Delta y_k\right) = \sum_{i=1}^{m}\left(\sum_{k=1}^{n} u_{ik}\,\Delta y_k\right)\Delta x_i \qquad (18)$$

where we have taken outside the brackets the common factor entering
into the summands of the inner sum. The transition from a two-dimensional
sum to a two-fold iterated sum, that is from (17) to
(18), can also be performed in the reverse order: the first, inner,
sum can be taken with respect to i and the outer, repeated, sum
with respect to k.

If the divisions along the y-axis are sufficiently small the sum
inside the brackets in (18) is close to the corresponding integral:

$$\sum_{k=1}^{n} u_{ik}\,\Delta y_k \approx \left(\int_c^d u\,dy\right)_i \qquad (19)$$

where the subscript i indicates that the value of x, which is fixed,
is taken for the ith column of the above table. It follows that

$$S \approx \sum_{i=1}^{m}\left(\int_c^d u\,dy\right)_i \Delta x_i$$

But this is also an integral sum for function (19) which depends on
x. Hence, if the divisions along the x-axis are also sufficiently small
we can write

$$S \approx \int_a^b\left(\int_c^d u(x, y)\,dy\right)dx \qquad (20)$$

because the sum is close to integral (20).
In the process of decreasing the subregions of the partitions equalities
(17) and (20) become more and still more accurate and turn
into the precise relations in the limit. Consequently,

$$I = \int_{(Q)} u\,dQ = \int_a^b\left(\int_c^d u(x, y)\,dy\right)dx \qquad (21)$$

Thus, to compute an integral taken over a rectangle with sides
parallel to the coordinate axes we can first perform the integration
with respect to y, for a fixed x, as the variable of integration y varies
within the rectangle (the inner integration) and then integrate the
result of the first integration (which depends only on x) with respect
to x within the corresponding limits of its variation (the outer
integration).

The reverse order of passing from sum (17) to an iterated two- 
fold sum (see above) would yield 


$$\int_{(Q)} u\,dQ = \int_c^d\left(\int_a^b u(x, y)\,dx\right)dy \qquad (22)$$


Hence, when computing a double integral in Cartesian coordinates 
we have two ways of passing to a repeated (iterated) two-fold inte- 
gral (as it will be shown in § 4, an 
analogous passage to an iterated inte- 
gral can be performed in any coordi- 
nate system). It should be noted that 
one of the ways usually turns out to 
be more difficult for practical calcu- 
lations whereas the other is simpler. 
The transition from one of these ways 
to the other is referred to as the inver- 
sion of the order of integration. 

Formula (21) can be readily interpreted geometrically. According to
Sec. 5, integral (16) is equal to the volume of the solid depicted in Fig. 307.
On the basis of Sec. XIV.10 we can compute the volume by integrating
the cross section area shaded in Fig. 307. Hence, we obtain

$$\int_{(Q)} u\,dQ = V = \int_a^b S(x)\,dx = \int_a^b\left(\int_c^d u\,dy\right)dx$$

Fig. 307

The geometric meaning of formula (22) is analogous to that of (21).
Hence, formulas (21) and (22) can be proved in a simpler manner
but we have given a more complicated method of proving since
it can be automatically extended to multiple integrals of an arbitrary
order.

Bearing in mind formulas (21) and (22) we sometimes denote the 
original integral (16) as 


$$\iint_{(Q)} u\,dQ \quad\text{or}\quad \iint_{(Q)} u\,dx\,dy$$


meaning that dQ = dx dy for a partition of the type shown in 
Fig. 306 when the divisions along the coordinate axes are sufficiently 
small. 

The computation of a repeated integral with constant limits of
integration of form (21) becomes particularly simple when the integrand
is a product of two factors each of which depends only on one
variable of integration. Namely, if u(x, y) = f₁(x) f₂(y) we have

$$\int_a^b \Big( \int_c^d f_1(x) f_2(y)\,dy \Big)\,dx = \int_a^b f_1(x) \Big( \int_c^d f_2(y)\,dy \Big)\,dx = \int_a^b f_1(x)\,dx \cdot \int_c^d f_2(y)\,dy$$


Thus we have obtained the product of two one-dimensional integrals. 
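
The factorization can be checked numerically. The following is a minimal sketch, not part of the original text: the helper `midpoint`, the factors `f1`, `f2` and the limits are all chosen here only for illustration.

```python
import math

def midpoint(f, a, b, n=400):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Hypothetical factors and limits, chosen only for illustration.
f1 = math.cos            # depends on x only
f2 = lambda y: y * y     # depends on y only
a, b, c, d = 0.0, 1.0, 0.0, 2.0

# Repeated integral: inner integration over y, outer integration over x.
iterated = midpoint(lambda x: midpoint(lambda y: f1(x) * f2(y), c, d), a, b)

# Product of the two one-dimensional integrals.
product = midpoint(f1, a, b) * midpoint(f2, c, d)

print(iterated, product)
```

Both numbers should agree up to rounding, and both approximate sin 1 · 8/3.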

9. Integral Over an Arbitrary Plane Region. Let (Ω), entering
into integral (16), be an arbitrary plane figure lying in the x, y-plane.
For instance, take the domain depicted in Fig. 308. The considerations
given in Sec. 8 can be transferred to this case with some
slight changes. Namely, instead of integral (19) we arrive at an
integral of the form

$$\int_{y_1}^{y_2} u(x, y)\,dy = \int_{\varphi_1(x)}^{\varphi_2(x)} u(x, y)\,dy$$

where y = y₁ = φ₁(x) and y = y₂ = φ₂(x) are, respectively, the
equations of the lower and upper parts of the boundary of the domain
(Ω). The contour bordering the figure (Ω) is divided into these
two parts by the points A and B (see Fig. 308). Accordingly, the
final result [which substitutes for formula (21) in this case] will be
of the form

$$I = \int_{(\Omega)} u\,d\Omega = \int_a^b \Big( \int_{\varphi_1(x)}^{\varphi_2(x)} u(x, y)\,dy \Big)\,dx \qquad (23)$$
Consequently, the limits of integration in the inner integral are
variable in the general case; they depend on the variable of integration
in the outer integral (i.e. on x in our case). The character of
this dependence is specified by the form of the contour. But the
limits of integration in the outer integral are constant as before.
They are specified by the maximal range of variation of x. Hence,
we see that the rule given after formula (21) remains valid for a
domain (Ω) of general form.

Here we can also invert the order of integration, that is perform
the first integration with respect to x and the second integration
with respect to y. Then in place of formula (22) we arrive at a formula
of the form

$$\int_{(\Omega)} u\,d\Omega = \int_c^d \Big( \int_{\psi_1(y)}^{\psi_2(y)} u(x, y)\,dx \Big)\,dy \qquad (24)$$

[let the reader find out what c, d, ψ₁(y), ψ₂(y) are by examining
Fig. 308].

Fig. 308
It is sometimes necessary to break the domain of integration
into several parts before setting up the limits of integration.

For example, let it be necessary to invert the order of integration
in the integral

$$I = \int_0^1 dx \int_{x^2}^{2x} f(x, y)\,dy$$

which can be written at length as

$$I = \int_0^1 \Big( \int_{x^2}^{2x} f(x, y)\,dy \Big)\,dx \qquad (25)$$


To do this we must first determine the geometric form of the domain
of integration. In this case it is bounded by the lines x = 0, x = 1,
y = x² and y = 2x (see Fig. 309), and the first, inner, integration
is performed along the line segments parallel to the y-axis, the
segments being shown in continuous lines in Fig. 309.
After the inversion of the order of integration the inner integration
will be carried out along the line segments parallel to the x-axis
shown in dotted lines in Fig. 309. We see that after the order
of integration has been inverted the inner integration is performed
from the straight line x = y/2 to the parabola x = √y for y < 1
and from the straight line x = y/2 to the straight line x = 1 for
y > 1. The value of the variable y which corresponds to the division
of the domain of integration into the two parts (in which the
upper limits of the inner integrals differ) is found as the ordinate
of the point of intersection of the parabola y = x² with the straight
line x = 1, i.e. y = 1. Hence, after the inversion of the order of
integration we obtain the sum of two integrals of the form

$$I = \int_0^1 dy \int_{y/2}^{\sqrt{y}} f(x, y)\,dx + \int_1^2 dy \int_{y/2}^{1} f(x, y)\,dx$$

instead of formula (25).

Fig. 310
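
The inversion of the order of integration in (25) can be checked numerically. The sketch below is only an illustration: the integrand `f` and the helper `midpoint` are assumptions made here, not taken from the text.

```python
import math

def midpoint(f, a, b, n=300):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Hypothetical integrand chosen for illustration.
f = lambda x, y: x * y + 1.0

# Original order (25): x from 0 to 1, inner y from x**2 to 2x.
original = midpoint(
    lambda x: midpoint(lambda y: f(x, y), x * x, 2.0 * x), 0.0, 1.0)

# Inverted order: the domain splits at y = 1 into two parts.
part1 = midpoint(
    lambda y: midpoint(lambda x: f(x, y), y / 2.0, math.sqrt(y)), 0.0, 1.0)
part2 = midpoint(
    lambda y: midpoint(lambda x: f(x, y), y / 2.0, 1.0), 1.0, 2.0)
inverted = part1 + part2

print(original, inverted)
```

For this particular integrand the exact value is 13/12, and both orders of integration should reproduce it to within the accuracy of the quadrature.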

In more complicated cases it is sometimes necessary to divide 
the domain of integration into a greater number of parts. For example, 
to set up the limits of integration in Cartesian coordinates for an 
integral taken over the domain shown in Fig. 310 it is necessary 
to break the domain into five parts (what are these parts?). 

We now consider a simple example of an application of the double
integral. By analogy with formula (9), we can easily deduce the
formulas for the coordinates of the geometric centre of gravity of
a plane figure (σ):

$$x_c = \frac{1}{\sigma} \iint_{(\sigma)} x\,dx\,dy, \qquad y_c = \frac{1}{\sigma} \iint_{(\sigma)} y\,dx\,dy \qquad (26)$$

where σ designates the area of the figure (σ). Let the whole figure
(σ) lie entirely on one side of the x-axis (see Fig. 311). Then the
second formula (26) can be rewritten in the form

$$\sigma\,y_c = \iint_{(\sigma)} y\,dx\,dy = \int_a^b dx \int_{y_1(x)}^{y_2(x)} y\,dy = \frac{1}{2} \int_a^b \big( y_2^2 - y_1^2 \big)\,dx$$

where y = y₁(x) and y = y₂(x) (a ≤ x ≤ b) are the lower and upper
parts of the boundary of (σ). Multiplying both sides by 2π and recalling
formula (XIV.35) of the volume of a solid of revolution we arrive at
Guldin's second theorem: if a homogeneous plane figure rotates about
an axis lying in the plane of the figure and not intersecting it the
volume of the solid of revolution thus obtained is equal to the product
of the area of the figure by the distance covered by its centre of gravity.
On the basis of the theorem we readily find the geometric centre of
gravity of a semicircle of radius R (see Fig. 312):

$$\frac{4}{3}\pi R^3 = \frac{\pi R^2}{2} \cdot 2\pi y_c$$

that is

$$y_c = \frac{4R}{3\pi}$$

Fig. 312        Fig. 313
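
The value obtained from Guldin's theorem can be compared with a direct evaluation of the second formula (26). This is a small sketch under assumptions made here: the radius `R = 2` and the helper `midpoint` are illustrative only.

```python
import math

def midpoint(f, a, b, n=400):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

R = 2.0  # assumed radius of the semicircle x**2 + y**2 <= R**2, y >= 0

def upper(x):
    """Upper boundary y = sqrt(R**2 - x**2) of the semicircle."""
    return math.sqrt(max(R * R - x * x, 0.0))

# Double integral of y over the semicircle, inner integration over y.
moment = midpoint(lambda x: midpoint(lambda y: y, 0.0, upper(x)), -R, R)
area = math.pi * R * R / 2.0

y_c = moment / area                  # second formula (26)
guldin = 4.0 * R / (3.0 * math.pi)   # value from Guldin's second theorem

print(y_c, guldin)
```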


10. Integral Over an Arbitrary Surface. Let us consider the
integral

$$I = \int_{(\Omega)} u\,d\Omega \qquad (27)$$

taken over an arbitrary surface (Ω) which can be curvilinear in the
general case (see Fig. 313). To compute it in Cartesian coordinates
we must consider the projection of the surface (Ω) on one of the
coordinate planes. For definiteness, let us take the projection of
(Ω) on the x, y-plane which we denote by (Ω').

Since the element (the area of an infinitesimal part) of a curvilinear
surface can be regarded as being plane to within infinitesimals
of higher order of smallness relative to the area, we have

$$d\Omega' = d\Omega\,|\cos\alpha| = d\Omega\,|\cos(\mathbf{n}, z)|$$

where n is a normal vector to the surface. It follows that

$$I = \int_{(\Omega)} u\,d\Omega = \int_{(\Omega')} \frac{u\,d\Omega'}{|\cos(\mathbf{n}, z)|} \qquad (28)$$

The last integral taken over the plane figure (Ω') is computed by
means of the methods of Sec. 9.


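Let the surface in question be represented by an equation of the
form z = f(x, y). Then, according to Sec. XII.2, the vector

$$\mathbf{n} = -\frac{\partial f}{\partial x}\,\mathbf{i} - \frac{\partial f}{\partial y}\,\mathbf{j} + \mathbf{k}$$

is directed along the normal to the surface at every point x, y, z
belonging to the surface. Hence (see Sec. VII.10) we have

$$\cos(\mathbf{n}, z) = \frac{\mathbf{n} \cdot \mathbf{k}}{|\mathbf{n}|\,|\mathbf{k}|} = \frac{1}{\sqrt{1 + \left(\dfrac{\partial f}{\partial x}\right)^2 + \left(\dfrac{\partial f}{\partial y}\right)^2}}$$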


Therefore, if integral (27) is given in the form

$$I = \int_{(\Omega)} u(x, y, z)\,d\Omega$$

we obtain, on the basis of formula (28), the expression

$$I = \iint_{(\Omega')} u\big(x, y, f(x, y)\big) \sqrt{1 + (f'_x)^2 + (f'_y)^2}\;dx\,dy$$

In particular, taking into account property 4 in Sec. 3 we derive
the formula for the area Ω of an arbitrary surface (Ω):

$$\Omega = \iint_{(\Omega)} d\Omega = \iint_{(\Omega')} \sqrt{1 + (z'_x)^2 + (z'_y)^2}\;dx\,dy$$

Here, as above, (Ω') is the projection of the surface (Ω) on the x, y-plane
and z = z(x, y) is the equation of the surface.
When projecting a surface, we sometimes have to divide it into
several parts. The projection on the planes y, z or x, z and the
corresponding computation of a surface integral are performed in a
similar way when it is expedient. [Let the reader deduce the formula
of |cos(n, z)| for a surface represented by an equation of the form
F(x, y, z) = 0.]
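
As an illustration of the area formula, the sketch below evaluates it for a surface chosen here as an assumption: the paraboloid z = x² + y² above the unit disc. The helper `midpoint` and the comparison value (obtained by passing to polar coordinates) are not from the original text.

```python
import math

def midpoint(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Assumed example surface: z = x**2 + y**2 above the disc x**2 + y**2 <= 1.
z_x = lambda x, y: 2.0 * x   # partial derivative of z with respect to x
z_y = lambda x, y: 2.0 * y   # partial derivative of z with respect to y

def integrand(x, y):
    """The factor 1 / |cos(n, z)| entering the area formula."""
    return math.sqrt(1.0 + z_x(x, y) ** 2 + z_y(x, y) ** 2)

def inner(x):
    """Inner integration over y between the two arcs of the unit circle."""
    half = math.sqrt(max(1.0 - x * x, 0.0))
    return midpoint(lambda y: integrand(x, y), -half, half)

area = midpoint(inner, -1.0, 1.0)

# Closed form of the same integral computed in polar coordinates.
exact = math.pi / 6.0 * (5.0 * math.sqrt(5.0) - 1.0)

print(area, exact)
```

The Cartesian quadrature converges slowly near the ends of the x-range, where the inner limits behave like a square root, so only rough agreement should be expected.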

11. Integral Over a Three-Dimensional Region. Let us now consider
an integral

$$I = \int_{(\Omega)} u\,d\Omega$$

where (Ω) is a solid, that is a domain in space. We compute it following
the procedure which was developed in Secs. 8 and 9 for an integral
over a plane figure. The corresponding integral sum is now
represented as a three-fold iterated sum. In the simplest case when
(Ω) is a rectangular parallelepiped defined by the inequalities
a ≤ x ≤ b, c ≤ y ≤ d and e ≤ z ≤ f we obtain, after passing to
the limit in the integral sum, the formula

$$I = \int_a^b dx \int_c^d dy \int_e^f u(x, y, z)\,dz$$

that is

$$I = \int_a^b \Big( \int_c^d \Big( \int_e^f u(x, y, z)\,dz \Big)\,dy \Big)\,dx$$

By the way, it is possible to perform here the integration by inverting
the order of integration in five different ways because there
are six combinations (permutations) of the differentials
dx, dy, dz.

In the case of a domain of integration of a more general form the
determination of the limits of integration will be more complicated.
Suppose that we want to set up the limits of integration when integrating
in the following order:

$$I = \int_{(\Omega)} u\,d\Omega = \int dx \int dy \int u(x, y, z)\,dz \qquad (29)$$

Let the domain of integration be of the form shown in Fig. 314.
Here the first (inner) integration is performed with respect to z
within the domain (Ω) for fixed x and y. Therefore the limits of
this integration are z₁ and z₂ (see Fig. 314), i.e. φ₁(x, y) and φ₂(x, y)
where z = φ₁(x, y) and z = φ₂(x, y) are the equations of the lower
and upper parts of the surface bordering the solid (Ω).
After the integration with respect to z and the substitution of the
limits of integration have been performed the result of the first
integration depends only on x and y. Now we pass to the projection
(Ω') of the solid (Ω) on the x, y-plane and perform the integration
with respect to y (the second integration). When integrating with
respect to y within the projection (Ω') we keep x fixed. Thus the
limits of the second integration are y₁ = ψ₁(x) and y₂ = ψ₂(x),
as it was described in Sec. 9. The result of the second integration
will depend only on x. It should be integrated with respect to x
over the maximal range of x, that is from a to b. This is the third,
outer integration. Thus, after setting up the limits of integration
we can put down integral (29) in the form

$$\int_{(\Omega)} u\,d\Omega = \int_a^b dx \int_{\psi_1(x)}^{\psi_2(x)} dy \int_{\varphi_1(x, y)}^{\varphi_2(x, y)} u(x, y, z)\,dz$$


The reader should pay attention 
to the fact that the limits of inte- 
gration in each integral depend 
only on those variables with res- 
pect to which the integration has not yet been performed. In 
particular, the limits of the outer integration cannot depend on 
the variables of integration and are constant. 

We can similarly set up the limits of integration in the other five
possible cases of inverting the order of integration. As in Sec. 9,
we sometimes have to divide the domain of integration into several
parts when setting up the limits of integration if the domain (Ω)
is of a more complicated form.
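
The rule for setting up the limits of a triple integral can be illustrated on a simple solid. The sketch below is an assumption made for illustration: the domain is the tetrahedron x ≥ 0, y ≥ 0, z ≥ 0, x + y + z ≤ 1, the integrand is u = z, and `midpoint` is a helper defined here.

```python
def midpoint(f, a, b, n=40):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Hypothetical integrand chosen for illustration.
u = lambda x, y, z: z

def over_z(x, y):
    """First (inner) integration: z from 0 to 1 - x - y, with x and y fixed."""
    return midpoint(lambda z: u(x, y, z), 0.0, 1.0 - x - y)

def over_y(x):
    """Second integration: y from 0 to 1 - x, with x fixed."""
    return midpoint(lambda y: over_z(x, y), 0.0, 1.0 - x)

# Third (outer) integration: constant limits 0 and 1.
value = midpoint(over_y, 0.0, 1.0)

print(value, 1.0 / 24.0)
```

The limits of each integration depend only on the variables that have not yet been integrated out, and the outer limits are constant; the exact value for this example is 1/24.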


§ 4. Change of Variables in Multiple Integrals 


12. Passing to Polar Coordinates in Plane. As in the case of a
one-dimensional integral, we can introduce different variables of
integration when computing a double integral. Here we shall consider
a typical example of computing a double integral in polar coordinates.
Let us take an integral of the form

$$I = \int_{(\Omega)} u\,d\Omega$$

where (Ω) is a region in the x, y-plane which is depicted in Fig. 315.


If it is necessary to perform the integration in polar coordinates
we must divide the domain into parts by means of the coordinate
curves of the polar coordinate system, i.e. by the lines ρ = const
and φ = const (see Sec. II.5), as it is shown in Fig. 315. Each of the
elementary areas thus obtained can be regarded as being equal to
a rectangle with sides dρ and ρ dφ to within infinitesimals of higher
order (why is it so?). Hence, we have

$$d\Omega = \rho\,d\rho\,d\varphi$$


Performing the summation over all the elementary areas we obtain

$$I = \iint_{(\Omega)} u\,\rho\,d\rho\,d\varphi$$

where the integrand must be, of course, expressed as a function of
ρ and φ. By analogy with Sec. 9, we set up the limits of integration
and thus obtain

$$\int_{(\Omega)} u\,d\Omega = \int_{\alpha}^{\beta} d\varphi \int_{f_1(\varphi)}^{f_2(\varphi)} u\,\rho\,d\rho \qquad (30)$$


The geometric meaning of the limits of integration is illustrated in
Fig. 315. Polar coordinates are particularly convenient for regions
whose boundary consists of coordinate curves of the polar coordinate
system because in such cases, when setting up the limits of integration,
we obtain constant limits not only in the outer integral but also in
the inner one. For example, after the limits of integration are set up, an
integral taken over the domain shown in Fig. 310 will have the
form

$$\int_{\alpha}^{\beta} d\varphi \int_{R_1}^{R_2} u\,\rho\,d\rho$$

Fig. 315
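
Formula (30) with constant limits can be tried on a disc. The sketch below is illustrative only: the integrand exp(−(x² + y²)), the radius `R = 2` and the helper `midpoint` are assumptions introduced here.

```python
import math

def midpoint(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

R = 2.0  # assumed radius of the disc

# Hypothetical integrand given in Cartesian coordinates.
u = lambda x, y: math.exp(-(x * x + y * y))

def inner(phi):
    """Inner integration over rho; the extra factor rho comes from
    the element of area rho * d(rho) * d(phi)."""
    return midpoint(
        lambda rho: u(rho * math.cos(phi), rho * math.sin(phi)) * rho,
        0.0, R)

polar = midpoint(inner, 0.0, 2.0 * math.pi)

# Closed form of the same integral.
exact = math.pi * (1.0 - math.exp(-R * R))

print(polar, exact)
```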


13. Passing to Cylindrical and Spherical Coordinates. Let us
take an integral

$$I = \int_{(\Omega)} u\,d\Omega \qquad (31)$$

where (Ω) is a domain of space. If it is necessary to perform the integration
in cylindrical coordinates (see Sec. X.1) we have to divide
the domain into parts by means of the coordinate surfaces of the
cylindrical coordinate system, i.e. the surfaces ρ = const, φ = const
and z = const. Then each of the elements of volume (see Fig. 316)
can be regarded as being equal to the volume of the rectangular
parallelepiped with dimensions dρ, ρ dφ and dz to within infinitesimals
of higher order of smallness (relative to the element of volume).
We suggest that the reader should verify this assertion. Consequently,
we have

$$d\Omega = d\rho \cdot \rho\,d\varphi \cdot dz = \rho\,d\rho\,d\varphi\,dz$$

Therefore integral (31) takes the form

$$I = \iiint_{(\Omega)} u\,\rho\,d\rho\,d\varphi\,dz$$

where the limits of integration are still to be set up as in Sec. 11
where we set the limits in Cartesian coordinates.

If we use spherical coordinates (see Sec. X.1) the element of
volume can be again regarded as being approximately equal to the
volume of the corresponding rectangular parallelepiped (see Fig. 317).
In this case the rectangular parallelepiped has the sides dr, r dθ
and r sin θ dφ and thus we have

$$d\Omega = dr \cdot r\,d\theta \cdot r \sin\theta\,d\varphi = r^2 \sin\theta\,dr\,d\theta\,d\varphi$$

Consequently, integral (31) takes the form

$$I = \iiint_{(\Omega)} u\,r^2 \sin\theta\,dr\,d\theta\,d\varphi \qquad (32)$$

The limits of integration are set up in a particularly simple manner
in these coordinate systems (and also in other systems) when the
boundary of the region (Ω) consists of coordinate surfaces because
in such a case not only the limits of the outer integration are constant
but of the first and second integrations as well.

Fig. 316        Fig. 317

As an example, let us consider the problem of determining the
position of the geometrical centre of gravity of a solid having the
form of a hemisphere of radius R. To do this we place the hemisphere
as it is shown in Fig. 318. Then the symmetry implies that the centre
of gravity will lie on the z-axis. Taking advantage of formula (9),
passing to spherical coordinates by means of formula (32) and taking
into account that z = r cos θ we obtain the following expression:

$$z_c = \frac{3}{2\pi R^3} \int_0^{2\pi} d\varphi \int_0^{\pi/2} d\theta \int_0^R r^3 \sin\theta \cos\theta\,dr = \frac{3}{8} R$$

(Check up the calculations!)

Fig. 318
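
One way of checking the calculation is to evaluate the repeated integral numerically. The following sketch assumes R = 1 and uses a helper `midpoint` defined here; it is an illustration, not the author's method.

```python
import math

def midpoint(f, a, b, n=50):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

R = 1.0  # assumed radius of the hemisphere

def over_r(theta):
    """Inner integration over r of z * r**2 * sin(theta), z = r * cos(theta)."""
    return midpoint(
        lambda r: r * math.cos(theta) * r * r * math.sin(theta), 0.0, R)

def over_theta(phi):
    """Second integration over theta from 0 to pi/2 (upper hemisphere)."""
    return midpoint(over_r, 0.0, math.pi / 2.0)

# Outer integration over phi from 0 to 2*pi, following formula (32).
moment = midpoint(over_theta, 0.0, 2.0 * math.pi)
volume = 2.0 * math.pi * R ** 3 / 3.0

z_c = moment / volume

print(z_c, 3.0 * R / 8.0)
```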

14. Curvilinear Coordinates in Plane. Besides Cartesian and polar
coordinates, we can introduce many other coordinate systems in
plane. Their common feature is that the points of the plane are always
characterized by two coordinates (see Sec. X.2).

We now consider a general coordinate system λ, μ whose coordinate
curves λ = const and μ = const are depicted in Fig. 319.
If these curves are drawn sufficiently close to one another the plane
is divided into small figures (cells) which can be regarded as parallelograms
to within infinitesimals of higher order.

Let the curves λ = const be drawn with the interval dλ and the
curves μ = const with the interval dμ. We denote the sides of one
of the small parallelograms by ds_λ = MP and ds_μ = MN (see
Fig. 319). If we neglect the infinitesimals of higher order we can consider
these sides to be directly proportional to dλ and dμ, i.e.

$$ds_\lambda = l_\lambda\,d\lambda, \qquad ds_\mu = l_\mu\,d\mu \qquad (33)$$

The quantities l_λ and l_μ are called Lamé's coefficients after G. Lamé
(1795-1870), a French mathematician and engineer. The coefficients
make it possible to compute linear sizes in a curvilinear coordinate
system. For a given coordinate system, Lamé's coefficients can have
different values at different points of the plane in the general case.
For instance, Fig. 319 indicates that these values are smaller in
the lower part of the plane than in the upper (why?). If it is necessary
to calculate the arc length of a finite portion of a coordinate
curve we must integrate the corresponding relation (33).


Let us introduce the radius-vector r = r(λ, μ) = OM of a variable
point M of the plane drawn from a fixed point O. Then the
sides MP and MN (taken as vectors) of the elementary parallelogram
depicted in Fig. 319 are equal to

$$\partial_\lambda \mathbf{r} = \mathbf{r}'_\lambda\,d\lambda \quad\text{and}\quad \partial_\mu \mathbf{r} = \mathbf{r}'_\mu\,d\mu$$

to within infinitesimals of higher order since these increments of
the radius-vector are due to the variation of only one of the coordinates.
It follows that |∂_λ r| = |r'_λ| dλ. But, according to Sec.
VII.23, we have |dr| = ds, and therefore |∂_λ r| = ds_λ. Hence,
taking advantage of (33), we derive

$$l_\lambda = |\mathbf{r}'_\lambda|, \quad\text{and similarly}\quad l_\mu = |\mathbf{r}'_\mu|$$
If besides the curvilinear coordinates λ, μ we introduce Cartesian
coordinates x, y whose origin is placed at the same point O we shall
have r = xi + yj (see Sec. VII.9), and consequently

$$l_\lambda = \left| \frac{\partial x}{\partial\lambda}\,\mathbf{i} + \frac{\partial y}{\partial\lambda}\,\mathbf{j} \right| = \sqrt{\left(\frac{\partial x}{\partial\lambda}\right)^2 + \left(\frac{\partial y}{\partial\lambda}\right)^2}$$

and

$$l_\mu = \left| \frac{\partial x}{\partial\mu}\,\mathbf{i} + \frac{\partial y}{\partial\mu}\,\mathbf{j} \right| = \sqrt{\left(\frac{\partial x}{\partial\mu}\right)^2 + \left(\frac{\partial y}{\partial\mu}\right)^2}$$

Let the reader derive the formulas l_ρ = 1 and l_φ = ρ for a polar
coordinate system whose origin coincides with the origin of the
Cartesian system x, y first on the basis of formulas x = ρ cos φ,
y = ρ sin φ and then directly by taking advantage of formulas (33).

The area dσ of any elementary parallelogram shown in Fig. 319
is proportional both to dλ and dμ, i.e.

$$d\sigma = k\,d\lambda\,d\mu \qquad (34)$$

where k is a coefficient which can take on different values at different
points in the general case. Applying the formula of the area
of a parallelogram we obtain

$$k = \frac{d\sigma}{d\lambda\,d\mu} = \frac{ds_\lambda\,ds_\mu \sin\alpha}{d\lambda\,d\mu}$$

i.e.

$$k = l_\lambda l_\mu \sin\alpha \qquad (35)$$

where α is the angle between the corresponding coordinate curves.
In particular, for an orthogonal coordinate system, i.e. a system
whose coordinate curves intersect at right angles, we have

$$k = l_\lambda l_\mu \qquad (36)$$


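In the general case we can deduce from formulas (34) and (VII.21)
the formula

$$k = \left|\; \begin{vmatrix} \dfrac{\partial x}{\partial\lambda} & \dfrac{\partial y}{\partial\lambda} \\[2mm] \dfrac{\partial x}{\partial\mu} & \dfrac{\partial y}{\partial\mu} \end{vmatrix} \;\right| = \left| \frac{D(x, y)}{D(\lambda, \mu)} \right| \qquad (37)$$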

(see Sec. IX.13 on the last notation). In deducing the formula we 
apply the property of determinants (property 7 in Sec. VI.2) accor- 
ding to which a determinant does not change its value when being 
transposed. 

The same result can be obtained if we note that the coefficient
k in formula (34) characterizes the change of areas under the mapping
of the λ, μ-plane on the x, y-plane defined by the formulas
x = x(λ, μ), y = y(λ, μ). By Sec. XI.14, this coefficient is equal
to the absolute value of the corresponding Jacobian, i.e. of the
determinant D(x, y)/D(λ, μ) entering into (37).

[Let the reader deduce the formula k = ρ for a polar coordinate
system taking advantage of formula (37). Do the same on the basis
of formula (36). Find Lamé's coefficients and the coefficient k for
a Cartesian coordinate system.]

If we take an integral of the form

$$I = \int_{(\sigma)} u\,d\sigma$$

taken over a finite plane region (σ) we obtain, on the basis of formula
(34), the expression

$$I = \iint_{(\sigma)} u\,k\,d\lambda\,d\mu \qquad (38)$$


where the limits of integration must be set up by analogy with
integrals (23), (24) and (30). The simplest case for setting up the
limits of integration is when the region in question is bounded by
a contour consisting of arcs of the corresponding coordinate curves
(why is it so?). [Let the reader verify that formula (38) turns into
formula (30) in the case of polar coordinates.]
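
The Lamé coefficients and the coefficient k can also be estimated numerically from the mapping x = x(λ, μ), y = y(λ, μ). The sketch below does this for polar coordinates with central differences; the function names, the step `eps` and the sample point are assumptions made here for illustration.

```python
import math

def position(rho, phi):
    """Cartesian coordinates (x, y) of the point with polar coordinates (rho, phi)."""
    return (rho * math.cos(phi), rho * math.sin(phi))

def partials(r, lam, mu, eps=1e-6):
    """Central-difference estimates of the vectors r'_lambda and r'_mu."""
    xp, yp = r(lam + eps, mu)
    xm, ym = r(lam - eps, mu)
    r_lam = ((xp - xm) / (2 * eps), (yp - ym) / (2 * eps))
    xp, yp = r(lam, mu + eps)
    xm, ym = r(lam, mu - eps)
    r_mu = ((xp - xm) / (2 * eps), (yp - ym) / (2 * eps))
    return r_lam, r_mu

def lame_and_k(r, lam, mu):
    """Return (l_lambda, l_mu, k) at the point (lam, mu)."""
    r_lam, r_mu = partials(r, lam, mu)
    l_lam = math.hypot(r_lam[0], r_lam[1])   # l_lambda = |r'_lambda|
    l_mu = math.hypot(r_mu[0], r_mu[1])      # l_mu = |r'_mu|
    # k = |D(x, y) / D(lambda, mu)|, formula (37)
    k = abs(r_lam[0] * r_mu[1] - r_lam[1] * r_mu[0])
    return l_lam, l_mu, k

l_rho, l_phi, k = lame_and_k(position, 1.7, 0.6)
print(l_rho, l_phi, k)
```

At the sample point ρ = 1.7 the estimates should be close to l_ρ = 1, l_φ = ρ and k = ρ, in agreement with formulas (36) and (37).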

15. Curvilinear Coordinates in Space. General curvilinear coordinates
λ, μ, ν in space are considered in a similar way. The surfaces
λ = const, μ = const and ν = const form three families of coordinate
surfaces whose intersections generate three families of coordinate
curves. The coordinate surfaces corresponding to the values
λ, λ + dλ; μ, μ + dμ and ν, ν + dν of the coordinates bound an
elementary volume (cell) in space which can be regarded as a parallelepiped
to within infinitesimals of higher order (the parallelepiped
can be oblique in the general case). Such parallelepipeds are
shown in Figs. 316 and 317 for the concrete cases of cylindrical and
spherical coordinates (in these cases the parallelepipeds are rectangular).
The length of one of the edges of the infinitesimal parallelepiped
is equal to

$$ds_\lambda = |\partial_\lambda \mathbf{r}| = |\mathbf{r}'_\lambda|\,d\lambda = l_\lambda\,d\lambda$$

(to within infinitesimals of higher order) where

$$l_\lambda = |\mathbf{r}'_\lambda| = \sqrt{\left(\frac{\partial x}{\partial\lambda}\right)^2 + \left(\frac{\partial y}{\partial\lambda}\right)^2 + \left(\frac{\partial z}{\partial\lambda}\right)^2}$$

is a Lamé coefficient. The lengths of the other two edges of the parallelepiped
are expressed similarly. The volume of the parallelepiped
is equal to dΩ = k dλ dμ dν where k is a coefficient which
can take on different values at different points in space. Therefore
the corresponding change of variables in a triple integral is performed
according to the formula

$$\int_{(\Omega)} u\,d\Omega = \iiint_{(\Omega)} u\,k\,d\lambda\,d\mu\,d\nu \qquad (39)$$
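In the case of an orthogonal coordinate system we have k = l_λ l_μ l_ν.
In the general case the coefficient k can be found on the basis of the
geometric meaning of a triple scalar product of vectors (see Sec.
VII.15):

$$\begin{aligned}
k &= \frac{d\Omega}{d\lambda\,d\mu\,d\nu} = \frac{|(\partial_\lambda\mathbf{r} \times \partial_\mu\mathbf{r}) \cdot \partial_\nu\mathbf{r}|}{d\lambda\,d\mu\,d\nu} = \frac{|(\mathbf{r}'_\lambda\,d\lambda \times \mathbf{r}'_\mu\,d\mu) \cdot \mathbf{r}'_\nu\,d\nu|}{d\lambda\,d\mu\,d\nu} \\
&= |(\mathbf{r}'_\lambda \times \mathbf{r}'_\mu) \cdot \mathbf{r}'_\nu| = \left|\; \begin{vmatrix} \dfrac{\partial x}{\partial\lambda} & \dfrac{\partial y}{\partial\lambda} & \dfrac{\partial z}{\partial\lambda} \\[2mm] \dfrac{\partial x}{\partial\mu} & \dfrac{\partial y}{\partial\mu} & \dfrac{\partial z}{\partial\mu} \\[2mm] \dfrac{\partial x}{\partial\nu} & \dfrac{\partial y}{\partial\nu} & \dfrac{\partial z}{\partial\nu} \end{vmatrix} \;\right| = \left| \frac{D(x, y, z)}{D(\lambda, \mu, \nu)} \right|
\end{aligned}$$

(Let the reader calculate the coefficients l_λ, l_μ, l_ν and k for Cartesian,
cylindrical and spherical coordinates.)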
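
For spherical coordinates the exercise can be cross-checked numerically. The sketch below estimates the vectors r'_r, r'_θ, r'_φ by central differences and forms the triple scalar product; the helper names, the step `eps` and the sample point are assumptions made here.

```python
import math

def position(r, theta, phi):
    """Cartesian coordinates of the point with spherical coordinates (r, theta, phi)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def partial(f, q, i, eps=1e-6):
    """Central-difference derivative of the vector f with respect to the i-th coordinate."""
    hi, lo = list(q), list(q)
    hi[i] += eps
    lo[i] -= eps
    return [(a - b) / (2 * eps) for a, b in zip(f(*hi), f(*lo))]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = (2.0, 0.8, 1.1)  # assumed sample point (r, theta, phi)
r_r, r_theta, r_phi = (partial(position, q, i) for i in range(3))

# Coefficient k as the absolute value of the triple scalar product.
k = abs(dot(cross(r_r, r_theta), r_phi))

# Lame coefficients as the lengths of the three derivative vectors.
lame = [math.sqrt(dot(v, v)) for v in (r_r, r_theta, r_phi)]

print(k, lame)
```

Since the spherical system is orthogonal, k should be close both to r² sin θ and to the product of the three Lamé coefficients.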

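16. Coordinates on a Surface. It is possible to introduce a coordinate
system on an arbitrary surface (see Sec. X.6 and Fig. 211).
Let us denote the coordinates by the letters λ and μ. In the manner
of Sec. 14, we can express, to within infinitesimals of higher order,
the sides and the area of an infinitesimal parallelogram bounded
by the coordinate curves corresponding to the values λ, λ + dλ
and μ, μ + dμ of the coordinates. Indeed,

$$ds_\lambda = |\partial_\lambda \mathbf{r}| = l_\lambda\,d\lambda$$

where

$$l_\lambda = |\mathbf{r}'_\lambda| = \sqrt{\left(\frac{\partial x}{\partial\lambda}\right)^2 + \left(\frac{\partial y}{\partial\lambda}\right)^2 + \left(\frac{\partial z}{\partial\lambda}\right)^2}$$

(ds_μ is expressed similarly). Thus, we have dσ = k dλ dμ where
k = l_λ l_μ for an orthogonal coordinate system and

$$\begin{aligned}
k &= |\mathbf{r}'_\lambda \times \mathbf{r}'_\mu| = \left|\; \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[1mm] \dfrac{\partial x}{\partial\lambda} & \dfrac{\partial y}{\partial\lambda} & \dfrac{\partial z}{\partial\lambda} \\[2mm] \dfrac{\partial x}{\partial\mu} & \dfrac{\partial y}{\partial\mu} & \dfrac{\partial z}{\partial\mu} \end{vmatrix} \;\right| \\
&= \sqrt{\left( \frac{\partial y}{\partial\lambda}\frac{\partial z}{\partial\mu} - \frac{\partial z}{\partial\lambda}\frac{\partial y}{\partial\mu} \right)^2 + \left( \frac{\partial z}{\partial\lambda}\frac{\partial x}{\partial\mu} - \frac{\partial x}{\partial\lambda}\frac{\partial z}{\partial\mu} \right)^2 + \left( \frac{\partial x}{\partial\lambda}\frac{\partial y}{\partial\mu} - \frac{\partial y}{\partial\lambda}\frac{\partial x}{\partial\mu} \right)^2}
\end{aligned}$$

in the general case. The transformation of a surface integral to the
variables λ and μ is performed according to formula (38).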

As an example, let us consider the surface of a sphere of fixed
radius R. The spherical coordinates φ, θ considered on the sphere
are an example of curvilinear coordinates on a surface. These coordinates
are expressed as ordinary spherical coordinates in space
with a fixed value r = R of the radius. This is an orthogonal system,
and Fig. 317 directly implies that

$$ds_\varphi = R \sin\theta\,d\varphi, \qquad ds_\theta = R\,d\theta$$

i.e.

$$l_\varphi = R \sin\theta, \qquad l_\theta = R$$

The coefficient k is expressed as k = l_φ l_θ = R² sin θ in this case.
Consequently, an integral taken over a portion (σ) of the sphere
(which can coincide with the whole sphere) is computed by the
formula

$$\int_{(\sigma)} u\,d\sigma = R^2 \iint_{(\sigma)} u \sin\theta\,d\varphi\,d\theta \qquad (40)$$

As an example, let us determine the force of attraction between
a material point of mass m and the surface (σ) of the whole material
sphere of radius R with a constant surface density of mass ρ. Because
of the symmetry, we can, without loss of generality, limit ourselves
to the disposition shown in Fig. 320. Every surface element dσ
attracts the mass m with the force dF which can be found on the basis
of Newton's law of gravitation:

$$|d\mathbf{F}| = \kappa\,\frac{m \rho\,d\sigma}{l^2} \qquad (41)$$

where κ is the constant of gravitation and l is the distance between
the mass m and the element dσ.

When summing up these elementary forces we must add together the
projections of the forces on the coordinate axes but not
the absolute values of the forces because they have different directions.
The symmetry implies that the resultant force is
directed along the z-axis and therefore we have to add together
the projections of all the elementary forces on the z-axis:

$$F = \int_{(\sigma)} (d\mathbf{F})_z = \int_{(\sigma)} |d\mathbf{F}| \cos\alpha = \kappa m \rho \int_{(\sigma)} \frac{l^2 + h^2 - R^2}{2lh} \cdot \frac{d\sigma}{l^2} = \kappa m \rho \int_{(\sigma)} \frac{2h^2 - 2Rh\cos\theta}{2h\,l^3}\,d\sigma$$

(the expression of cos α has been found on the basis of the cosine
law which can be written as R² = l² + h² − 2lh cos α in this case).

Fig. 320

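Substituting l = √(R² + h² − 2Rh cos θ) into the last integral
and passing to spherical coordinates according to formula (40)
we obtain

$$\begin{aligned}
F &= \kappa m \rho \int_{(\sigma)} \frac{h - R\cos\theta}{(R^2 + h^2 - 2Rh\cos\theta)^{3/2}}\,d\sigma \\
&= \kappa m \rho \int_0^{\pi} d\theta \int_0^{2\pi} \frac{h - R\cos\theta}{(R^2 + h^2 - 2Rh\cos\theta)^{3/2}}\,R^2 \sin\theta\,d\varphi \\
&= R^2 \kappa m \rho \int_0^{\pi} \frac{(h - R\cos\theta) \sin\theta}{(R^2 + h^2 - 2Rh\cos\theta)^{3/2}}\,d\theta \int_0^{2\pi} d\varphi \\
&= -2\pi R^2 \kappa m \rho \int_0^{\pi} \frac{h - R\cos\theta}{(R^2 + h^2 - 2Rh\cos\theta)^{3/2}}\,d\cos\theta \\
&= 2\pi R^2 \kappa m \rho \int_{-1}^{1} \frac{h - Rt}{(R^2 + h^2 - 2Rht)^{3/2}}\,dt
\end{aligned}$$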


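Performing the substitution R² + h² − 2Rht = l² (l > 0), −2Rh dt =
= 2l dl we finally deduce:

$$\begin{aligned}
F &= 2\pi R^2 \kappa m \rho \int_{R+h}^{|R-h|} \frac{h - \dfrac{R^2 + h^2 - l^2}{2h}}{l^3} \left( -\frac{2l\,dl}{2Rh} \right) \\
&= \frac{\pi R \kappa m \rho}{h^2} \int_{|R-h|}^{R+h} \frac{l^2 + h^2 - R^2}{l^2}\,dl \\
&= \frac{\pi R \kappa m \rho}{h^2} \left[ R + h - |R - h| - (h^2 - R^2) \left( \frac{1}{R+h} - \frac{1}{|R-h|} \right) \right] \qquad (42)
\end{aligned}$$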


If h > R we have |R − h| = h − R. Substituting this expression
into formula (42) we obtain (check it up!) the formula

$$F = \kappa\,\frac{4\pi R^2 \rho\,m}{h^2} = \kappa\,\frac{mM}{h^2} \qquad (h > R)$$

where M is the total mass of the sphere. If h < R we have |R − h| =
= R − h, and thus we similarly find

$$F = 0 \qquad (h < R)$$
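
Both results can be checked by evaluating the integral over the sphere numerically, without the substitution leading to (42). The sketch below is an illustration under assumptions made here: units with κ = m = ρ = R = 1, a midpoint rule in θ, and the function name `shell_force`.

```python
import math

def shell_force(h, R=1.0, kappa=1.0, m=1.0, rho=1.0, n=2000):
    """Attraction of a point mass at distance h from the centre of a
    homogeneous spherical surface, integrated in the coordinates of formula (40)."""
    step = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * step
        # squared distance l**2 from the surface element to the point mass
        l2 = R * R + h * h - 2.0 * R * h * math.cos(theta)
        total += (h - R * math.cos(theta)) / l2 ** 1.5 * math.sin(theta)
    # the integration over phi contributes the factor 2*pi
    return 2.0 * math.pi * R * R * kappa * m * rho * total * step

outside = shell_force(2.0)   # point outside the sphere, h > R
inside = shell_force(0.5)    # point inside the sphere, h < R

# kappa * m * M / h**2 with M = 4*pi*R**2*rho and h = 2
point_mass = 4.0 * math.pi / 2.0 ** 2

print(outside, point_mass, inside)
```

The first two numbers should agree, and the third should be close to zero.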


Hence, a homogeneous sphere attracts material points lying 
outside it, as if the total mass of the sphere were concentrated at 
its centre, and does not attract points lying inside it. Now let us 
consider a material solid of spherical form with a spherically sym- 
metric mass distribution (that is the density depends only on the 
distance from the centre). Such a solid can be thought of as consis- 
ting of spherical layers of infinitesimal width bounded by concen- 
tric spheres. Each layer can be regarded as a material surface to 
which the above result can be applied. Thus, we conclude that 
the spherical solid attracts a point lying outside it as if the whole 
mass were concentrated at its centre. Similarly, a point lying inside 
the solid is attracted only by the portion of the solid which lies 
closer to the centre than the point.

§ 5. Other Types of Multiple Integrals


17. Improper Integrals. The theory of improper multiple integrals
is similar to that of one-dimensional integrals (see § XIV.4). We
begin with the integral

$$I = \int_{(\Omega)} u\,d\Omega \qquad (43)$$

in which the integrand u is a finite function and the domain of
integration is unbounded (infinite). It is defined as the limit

$$\int_{(\Omega)} u\,d\Omega = \lim_{(\Omega') \to (\Omega)} \int_{(\Omega')} u\,d\Omega' \qquad (44)$$


where the domain (Ω') on the right-hand side is finite. This domain
expands in the limiting process and exhausts the whole domain (Ω)
in the limit (see Fig. 321). If limit (44) exists, is finite and independent
of a particular way in which the domain (Ω') expands integral (43) is
said to be convergent. If otherwise the integral is referred to as being
divergent. If limit (44) equals infinity we write

$$\int_{(\Omega)} u\,d\Omega = \infty$$

If u ≥ 0 integral (44) either converges or diverges and

$$\int_{(\Omega)} u\,d\Omega = +\infty$$

(i.e. it is divergent to infinity).

Fig. 321 (labels: boundary of the domain (Ω); boundary of the domain (Ω'))


In such a case we can set up the limits
of integration in any coordinate system (convenient for the computation)
by the rules given in §§ 3, 4. If the result of the substitution
of the limits is finite the integral is convergent and if
otherwise the integral is divergent. The comparison tests [see
(XIV.49) and (XIV.50)] remain valid in this case. In applying the
comparison tests we can use integrals (XIV.51) and some other
integrals. For instance, in the case when (Ω) is the whole plane we
often perform the comparison with an integral of the function r^(−p)
where r = √(x² + y²) is the length of the radius-vector. It is only
the behaviour of the integrand for large values of r that is essential
for the convergence of an integral of this type, and therefore
we must investigate the integral

$$\iint_{(r > r_0)} r^{-p}\,dx\,dy = \int_0^{2\pi} d\varphi \int_{r_0}^{\infty} r^{-p}\,r\,dr = 2\pi \int_{r_0}^{\infty} r^{1-p}\,dr \qquad (r_0 > 0)$$

(we have passed to polar coordinates here). According to Sec. XIV.15
[see formula (XIV.51)], the integral is finite for p > 2 and infinite
for p ≤ 2. Similarly, in the three-dimensional space x, y, z the
integral of r^(−p) = (√(x² + y² + z²))^(−p) (taken over the whole space
with a sphere of radius r₀ > 0 and centre at the origin of coordinates
cut out) converges only for p > 3.
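
The condition p > 2 can be verified by a direct computation. In the short derivation below the upper limit is first replaced by a finite radius A (a symbol introduced here only for this check) and then A is allowed to tend to infinity:

```latex
\[
2\pi \int_{r_0}^{A} r^{1-p}\,dr =
\begin{cases}
\dfrac{2\pi}{2-p}\left( A^{2-p} - r_0^{2-p} \right), & p \neq 2, \\[2mm]
2\pi \ln \dfrac{A}{r_0}, & p = 2,
\end{cases}
\]
\[
\lim_{A \to \infty} 2\pi \int_{r_0}^{A} r^{1-p}\,dr =
\begin{cases}
\dfrac{2\pi\, r_0^{2-p}}{p-2}, & p > 2, \\[2mm]
\infty, & p \le 2.
\end{cases}
\]
```

In space the element of volume in spherical coordinates contributes the factor r², so that the same argument applied to the integral of r^(2−p) gives convergence only for p > 3.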

As an example of an application of improper multiple integrals,
let us deduce formula (XIV.70). We take the integral

$$I = \int_0^{\infty}\!\!\int_0^{\infty} x^{p-1} y^{p+q-1} e^{-(x+1)y}\,dx\,dy \qquad (p > 0,\ q > 0)$$

whose domain of integration is the first quadrant of the x, y-plane. 
The integrand being a positive function, we can perform the inte- 
gration in any order. This yields the following results: 


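$$(1)\quad I = \int_0^{\infty} dy \int_0^{\infty} x^{p-1} y^{p+q-1} e^{-(x+1)y}\,dx = \int_0^{\infty} dy \int_0^{\infty} s^{p-1} y^{q-1} e^{-s-y}\,ds$$

(we have made the substitution x = s/y). Calculating we find:

$$I = \int_0^{\infty} s^{p-1} e^{-s}\,ds \cdot \int_0^{\infty} y^{q-1} e^{-y}\,dy = \Gamma(p)\,\Gamma(q)$$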


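$$(2)\quad I = \int_0^{\infty} dx \int_0^{\infty} x^{p-1} y^{p+q-1} e^{-(x+1)y}\,dy = \int_0^{\infty} dx \int_0^{\infty} x^{p-1} \left( \frac{t}{x+1} \right)^{p+q-1} e^{-t}\,\frac{dt}{x+1}$$

(we have employed the change of variable y = t/(x + 1)). Now, applying
formula (XIV.73) we deduce:

$$I = \int_0^{\infty} \frac{x^{p-1}}{(x+1)^{p+q}}\,dx \cdot \int_0^{\infty} t^{p+q-1} e^{-t}\,dt = B(p, q)\,\Gamma(p+q)$$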

Comparing the results we derive the desired formula. 
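
The equality of the two results can be tested numerically for particular values of p and q. The sketch below is only an illustration: the sample values `p = 2.5`, `q = 3.0`, the helper `integral_to_infinity` and the change of variable x = t/(1 − t) used to map the infinite range onto (0, 1) are choices made here.

```python
import math

def integral_to_infinity(g, n=4000):
    """Midpoint rule for the integral of g over (0, infinity),
    using the change of variable x = t / (1 - t), 0 < t < 1."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x = t / (1.0 - t)
        total += g(x) / (1.0 - t) ** 2   # dx = dt / (1 - t)**2
    return total * h

p, q = 2.5, 3.0   # assumed sample values with p > 0, q > 0

# The integral over x that appears in result (2).
beta = integral_to_infinity(lambda x: x ** (p - 1) / (x + 1) ** (p + q))

first = math.gamma(p) * math.gamma(q)   # result (1)
second = beta * math.gamma(p + q)       # result (2)

print(first, second)
```
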
If the function u takes on the values of both signs and

$$\int_{(\Omega)} |u|\,d\Omega < \infty \qquad (45)$$

integral (43) converges and is said to be absolutely convergent.
In evaluating such an integral we can choose any convenient coordinate
system and set up the limits of integration. It can also be
proved that if condition (45) is violated integral (43) is divergent.
In this case limit (44) can depend on the way in which the domain
(Ω') expands. Then it can happen that after setting up the limits
of integration and evaluating the integral we obtain a finite result
in one coordinate system, an infinite result in some other system,
a finite result but different from the first one in a third system and
a divergence of an oscillating type in a fourth coordinate system
(see Sec. XIV.14). Hence, in such a case the possibility of changing
variables and inverting the order of integration should be additionally
investigated. There are no such problems in evaluating absolutely
convergent integrals.

Improper multiple integrals of other types are treated similarly. 
Namely, if the domain of integration contains a point, a curve or 
a surface on which the integrand approaches infinity, the singula- 
rity (i.e. the point, the curve or the surface) is cut out of the domain 
together with its neighbourhood and then the boundary of the cut- 
out portion is contracted to the singularity in an arbitrary way. 
The limit thus obtained is taken as the value of the improper integral 
provided this limit is finite and independent of the way of the con- 
traction. If an improper integral of this type of a function which 
is positive everywhere or positive near its singularities converges 
the integration can be performed in any coordinate system. The 
same can be done for arbitrary functions if they are absolutely 
integrable. When investigating an integral whose integrand has an
isolated singularity, that is a separate point at which the integrand
approaches infinity, we often apply the comparison test using the
integral ∬ r^(−p) dx dy taken over the region r < r₀ in plane and the
integral ∭ r^(−p) dx dy dz taken over the region r < r₀ in space.
We can easily verify that the former converges only for
p < 2 and the latter for p < 3. If there is a non-isolated singularity
then in investigating the convergence it is convenient to choose a
coordinate system in such a way that the singularity should coincide
with one of the coordinate curves or surfaces.

18. Integrals Dependent on a Parameter. Let us consider integrals
of the form

$$I(\lambda) = \int_{(\Omega)} f(M, \lambda)\,d\Omega$$

where M is a variable point in the domain (Ω) whose coordinates
are the variables of integration and λ is a parameter which is kept
constant in the process of integration. The theory of such integrals
is developed by analogy with Sec. XIV.5. All the basic assertions
proved there remain valid here. There is a difficulty in this case
which arises when the domain of integration also depends on the
parameter. In such cases we often try to change the variables so
that the new domain of integration should become fixed. But, of
course, such integrals can also be investigated directly.

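For instance, let us consider a triple integral of the form 

$$I(\lambda) = \iiint\limits_{(\Omega_\lambda)\ (\varphi_\lambda(x,\,y,\,z)<0)} f(M)\,d\Omega$$

where the function $\varphi_\lambda(x, y, z)$ depends on the parameter $\lambda$ and the 
domain of integration $(\Omega_\lambda)$ is the region in which $\varphi_\lambda < 0$. Let it be 
necessary to compute the derivative $\frac{dI}{d\lambda}$. The expression $dI$ coincides, 
to within infinitesimals of higher order, with the difference 
$I(\lambda + d\lambda) - I(\lambda)$, which is equal to the integral of $f(M)$ taken over 
a thin layer bounded by the surfaces $(S_\lambda)$ and $(S_{\lambda+d\lambda})$ with the 
equations $\varphi_\lambda = 0$ and $\varphi_{\lambda+d\lambda} = 0$, respectively. Let us take a point 
$A$ on $(S_\lambda)$ and draw the normal to $(S_\lambda)$ at $A$. We reckon the distances 
from $A$ along the normal and consider them positive in the direction 
to the region where $\varphi_\lambda > 0$, that is in the direction of the outer normal 
to the boundary surface of the domain of integration. We now 
designate the point of intersection of the normal with the surface 
$(S_{\lambda+d\lambda})$ by $A'$. Then the quantity $dn = AA'$ is equal to the width 
of the layer at the point $A$. We have $\varphi_\lambda(A) = 0$ because the point 
$A$ belongs to $(S_\lambda)$. We also have $\varphi_{\lambda+d\lambda}(A') = 0$ (why?). Now we can 
write the relation 

$$\varphi_{\lambda+d\lambda}(A') = \varphi_\lambda(A) + \frac{\partial\varphi_\lambda}{\partial\lambda}\,d\lambda + |\operatorname{grad}\varphi_\lambda|\,dn$$

which is accurate to within infinitesimals of higher order. It follows 
that 

$$|\operatorname{grad}\varphi_\lambda|\,dn + \frac{\partial\varphi_\lambda}{\partial\lambda}\,d\lambda = 0, \quad\text{i.e.}\quad dn = -\frac{\partial\varphi_\lambda/\partial\lambda}{|\operatorname{grad}\varphi_\lambda|}\,d\lambda$$

The element of volume of the layer can be written as $d\Omega = dS\,dn$ 
where $dS$ is the surface element of $(S_\lambda)$. Consequently, we have 

$$d\Omega = -\frac{\partial\varphi_\lambda/\partial\lambda}{|\operatorname{grad}\varphi_\lambda|}\,d\lambda\,dS$$

The quantity $d\Omega$ can be positive or negative here. Its sign indicates 
whether the volume of the layer is added to the volume of the original 
domain (corresponding to the value $\lambda$ of the parameter) or subtracted 
from it. It follows that 

$$dI = \iint_{(S_\lambda)} f \left(-\frac{\partial\varphi_\lambda/\partial\lambda}{|\operatorname{grad}\varphi_\lambda|}\,d\lambda\right) dS, \quad\text{i.e.}\quad \frac{dI}{d\lambda} = -\iint_{(S_\lambda)} f\,\frac{\partial\varphi_\lambda/\partial\lambda}{|\operatorname{grad}\varphi_\lambda|}\,dS$$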
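
As a check of the last formula, let us consider a simple particular case which is chosen here only for illustration: $f \equiv 1$ and $\varphi_\lambda = x^2 + y^2 + z^2 - \lambda^2$ with $\lambda > 0$, so that the domain of integration is the ball of radius $\lambda$ and the integral is its volume.

```latex
I(\lambda) = \frac{4}{3}\pi\lambda^3, \qquad
\frac{\partial\varphi_\lambda}{\partial\lambda} = -2\lambda, \qquad
|\operatorname{grad}\varphi_\lambda| = 2\sqrt{x^2+y^2+z^2} = 2\lambda
  \quad \text{on } (S_\lambda),
\\[4pt]
\frac{dI}{d\lambda}
  = -\iint_{(S_\lambda)} \frac{-2\lambda}{2\lambda}\,dS
  = \iint_{(S_\lambda)} dS
  = 4\pi\lambda^2 .
```

The result agrees with the direct differentiation of $\frac{4}{3}\pi\lambda^3$ with respect to $\lambda$.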


An integral can depend on several parameters. The coordinates 
of a point N varying within a certain region can play the role of 
such parameters. An integral of this type is written as 


$$I(N) = \int_{(\Omega)} f(M, N)\,d\Omega_M$$

where the symbol $d\Omega_M$ indicates the fact that it is the moving point 
M whose coordinates are the variables of integration (see Sec. 2). 
The point N which can occupy different positions is kept fixed in 
the process of integration. Its coordinates are the parameters. The 
basic properties of integrals dependent on a parameter (see Sec. 
XIV.5) are easily extended to these integrals. 

An integral can be integrated with respect to a parameter on 
which it depends. This results in a multiple integral of higher order. 
For example, let us consider the problem of calculating the force 
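F of attraction between two material bodies $(\Omega_1)$ and $(\Omega_2)$ whose 
densities $\rho_1$ and $\rho_2$ can be variable in the general case. Let us take 
two elements of volume $d\Omega_1$ and $d\Omega_2$ placed at some points $M_1$ and 
$M_2$ belonging to $(\Omega_1)$ and $(\Omega_2)$, respectively. The force with which 
the element $d\Omega_1$ attracts the element $d\Omega_2$ can be expressed on the 
basis of Newton’s law of gravitation: 

$$d(d\mathbf{F}) = \kappa\,\frac{\rho_1\,d\Omega_1 \cdot \rho_2\,d\Omega_2}{|M_2M_1|^2}\,(\overrightarrow{M_2M_1})^0 = \kappa\,\frac{\rho_1\rho_2\,\overrightarrow{M_2M_1}}{|M_2M_1|^3}\,d\Omega_1\,d\Omega_2$$

where $(\overrightarrow{M_2M_1})^0$ is the unit vector in the direction of the vector $\overrightarrow{M_2M_1}$ 
and $\kappa$ is the gravitational constant. 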
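Now, integrating over $(\Omega_1)$, we obtain the force with which the whole 
body $(\Omega_1)$ attracts the element $d\Omega_2$: 

$$d\mathbf{F} = \kappa \left( \int_{(\Omega_1)} \frac{\rho_1\,\overrightarrow{M_2M_1}}{|M_2M_1|^3}\,d\Omega_1 \right) \rho_2\,d\Omega_2$$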


In the above integral the integration is performed with respect 
to the coordinates of the variable point $M_1$ running throughout 
the domain $(\Omega_1)$ when the point $M_2$ is arbitrarily fixed in the domain 
$(\Omega_2)$, the coordinates of $M_2$ being parameters. To obtain the resultant 
force of attraction we must additionally integrate with respect 
to the coordinates of $M_2$: 

$$\mathbf{F} = \kappa \int_{(\Omega_2)} \rho_2\,d\Omega_2 \int_{(\Omega_1)} \frac{\rho_1\,\overrightarrow{M_2M_1}}{|M_2M_1|^3}\,d\Omega_1$$


If we set up the limits of integration we shall obtain a six-fold 
iterated integral. For instance, in Cartesian coordinates we shall 
have 

$$\mathbf{F} = \kappa \iiint_{(\Omega_2)} \rho_2(x_2, y_2, z_2)\,dx_2\,dy_2\,dz_2 \times$$

$$\times \iiint_{(\Omega_1)} \frac{\rho_1(x_1, y_1, z_1)\,[(x_1 - x_2)\,\mathbf{i} + (y_1 - y_2)\,\mathbf{j} + (z_1 - z_2)\,\mathbf{k}]}{[(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2]^{3/2}}\,dx_1\,dy_1\,dz_1$$

The limits of integration which are to be set up in the above integral 
depend on the form of the domains $(\Omega_1)$ and $(\Omega_2)$. 
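
The six-fold iterated integral can be approximated by a six-fold sum. The following is only a minimal sketch under stated assumptions: both bodies are homogeneous cubes split into small cells, each cell is replaced by a mass point at its centre, and the gravitational constant is set to unity by default. The helper names `cube_cells` and `attraction` are hypothetical and are introduced only for this illustration.

```python
def cube_cells(center, side, n, rho):
    """Split a homogeneous cube into n**3 cells; return (x, y, z, mass) per cell."""
    cx, cy, cz = center
    h = side / n
    cell_mass = rho * h ** 3
    cells = []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                cells.append((cx - side / 2 + (i + 0.5) * h,
                              cy - side / 2 + (j + 0.5) * h,
                              cz - side / 2 + (k + 0.5) * h,
                              cell_mass))
    return cells


def attraction(body1, body2, kappa=1.0):
    """Approximate the force with which body1 attracts body2.

    Each body is a list of (x, y, z, mass) cells; kappa = 1 is an
    assumption made only to keep the numbers simple.
    """
    fx = fy = fz = 0.0
    for x2, y2, z2, m2 in body2:            # outer integration over the second body
        for x1, y1, z1, m1 in body1:        # inner integration over the first body
            dx, dy, dz = x1 - x2, y1 - y2, z1 - z2      # components of the vector M2M1
            r3 = (dx * dx + dy * dy + dz * dz) ** 1.5   # |M2M1| cubed
            c = kappa * m1 * m2 / r3
            fx += c * dx
            fy += c * dy
            fz += c * dz
    return fx, fy, fz
```

For two unit cubes of unit mass whose centres are ten units apart the sum is close to the force between two mass points, as should be expected for bodies which are small in comparison with the distance between them.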

19. Integrals with Respect to Measure. Generalized Functions. 
For definiteness, let us consider triple integrals. In Sec. 1, the volume 
of a domain in space was called its measure. But this is only the 
simplest example of a measure which is referred to as the Lebesgue 
measure (after the French mathematician H. Lebesgue, 1875-1941, 
who developed the general theory of the measure). It is also possible 
to introduce other measures with respect to which integration can 
be performed. 

We begin with an example. Let a mass m be distributed in space 
(see Sec. 6). By analogy with Sec. 18, we can readily deduce the 
expression 

$$\mathbf{F} = \kappa m_0 \int \frac{\overrightarrow{NM}}{|NM|^3}\,dm \qquad (46)$$

of the force with which the distributed mass m acts upon a mass 
point $m_0$ placed at the point N. Here M is a variable point in space 
whose coordinates are the variables of integration and the integration 
extends over the whole region of space in which the mass m 
is distributed. 

If the mass m is distributed continuously, that is if the mass of 
every surface, curve or point is equal to zero, we can introduce the 
density $\rho$ (see Sec. 6), and then integral (46) can be reduced to an 
ordinary triple integral 

$$\mathbf{F} = \kappa m_0 \int \rho\,\frac{\overrightarrow{NM}}{|NM|^3}\,d\Omega \qquad (47)$$

But we sometimes deal with a mass which is concentrated on 
separate surfaces, curves or at points. Then the ordinary notion of 
density cannot be applied and hence it is impossible to pass to 
integral (47). In such cases we have to consider integral (46) as an 
integral with respect to the measure m. 

The general definition of a measure in space is analogous to the 
basic definition given in Sec. 7. To every part $(\Omega)$ of space which 
can be mentally isolated from it (i.e. to each solid, surface, curve 
or point etc.), a certain value $\mu_{(\Omega)}$ of the measure $\mu$ must correspond. 
The condition of additivity is also imposed on $\mu$. The measure of 
a surface, curve or point need not be equal to zero in the general 
case. We usually deal with non-negative measures $\mu \geq 0$ but it 
is sometimes expedient to consider signed measures (also called 
charges) which can assume values of either sign. In such cases the 
measure can be thought of not as a mass but as an electric charge 
which can be positive or negative. A measure can be defined not 
only in space but also on a surface or curve. 

The definition of an integral with respect to measure is similar 
to the ordinary definition given in Sec. 2 (integrals with respect 
to measure are also called Stieltjes integrals). Let us take a domain 
$(\Omega)$ in space. If a measure $\mu$ and a function u (M) [where M 
is a variable point running over $(\Omega)$] are defined in the domain $(\Omega)$ 
the integral of u with respect to the measure $\mu$ is defined as 

$$\int_{(\Omega)} u\,d\mu = \lim \sum_{k=1}^{n} u(M_k)\,\mu_{(\Delta\Omega_k)} \qquad (48)$$

where the meaning of the notation is obvious. Such an integral 
always exists provided the function u is finite in $(\Omega)$ and the measure 
of the whole domain $(\Omega)$ is also finite (if $\mu$ is not non-negative we 
must additionally impose the condition that $\int_{(\Omega)} |d\mu| < \infty$, which 
means that the positive and negative variations (parts) of $\mu$ should 
also be finite). If the function u is discontinuous the form of the integral 
sums used in (48) must also be specified but we shall not discuss 
this question at length here. Improper integrals with respect 
to measure are defined after the manner of Sec. 17. The properties 
of integral (48) are similar to those discussed in Sec. 3. When we 
speak about the properties related to integrating inequalities we 
must additionally impose the condition $\mu \geq 0$. 
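
The integral sum in (48) can be illustrated numerically. The following is only a sketch under simplifying assumptions: instead of a domain in space we take a segment [a, b], and the measure consists of a continuous part with a given density and of several point masses. The function name `integrate_wrt_measure` and its parameters are hypothetical.

```python
def integrate_wrt_measure(u, density, atoms, a, b, n=1000):
    """Integral sum of type (48) on the segment [a, b].

    The measure is assumed to consist of a continuous part with the given
    density and of point masses; atoms is a list of (position, mass) pairs
    with positions lying in [a, b].
    """
    h = (b - a) / n
    # measure of every cell: the continuous part (midpoint rule) ...
    cell_measure = [density(a + (k + 0.5) * h) * h for k in range(n)]
    # ... plus the point masses which fall into the cell
    for position, mass in atoms:
        k = min(int((position - a) / h), n - 1)
        cell_measure[k] += mass
    # integral sum: u(M_k) times the measure of the k-th cell,
    # M_k being the midpoint of the cell
    return sum(u(a + (k + 0.5) * h) * cell_measure[k] for k in range(n))
```

For instance, for u = x² on [0, 1], unit density and a point mass 2 placed at x = 0.5, the sum approaches 1/3 + 2 · 0.25 as the cells are refined. The second summand is the contribution of the point mass, which cannot be described by an ordinary density.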

If the measure of every surface, curve or point is equal to zero 
it is possible to pass to an ordinary triple integral taken over a 
volume: 

$$\int_{(\Omega)} u\,d\mu = \int_{(\Omega)} u\,\frac{d\mu}{d\Omega}\,d\Omega = \int_{(\Omega)} u\rho\,d\Omega \qquad \left(\rho = \frac{d\mu}{d\Omega}\right) \qquad (49)$$

This transition can be performed for any measure but in the general 
case $\rho$ will be a generalized function. 

The simplest generalized function in space is the delta function 

$$\delta(x - a)\,\delta(y - b)\,\delta(z - c) \qquad (50)$$


(see Sec. XIV.25) which describes the volume density of unit mass 
placed at the point (a, b, c). The function $\delta(y - b)\,\delta(z - c)$ describes 
the volume density of a mass uniformly distributed along 
the straight line y = b, z = c with unit linear density. The function 
$\delta(z - c)$ is the volume density of a mass uniformly distributed 
over the plane z = c with unit areal (surface) density. Using these 
functions and some other generalized functions (in particular, delta 
functions depending on curvilinear coordinates) we can perform 
transformation (49) in the general case. 
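
To see how these densities act under the integral sign, we can write out the corresponding integrals for a continuous function u. This is only a sketch: it is assumed that the point, the line and the plane in question lie inside the domain of integration, and the integrals on the right-hand sides are taken over the parts of the line and of the plane which belong to the domain.

```latex
\iiint u(x, y, z)\,\delta(x-a)\,\delta(y-b)\,\delta(z-c)\,dx\,dy\,dz = u(a, b, c),
\\[4pt]
\iiint u(x, y, z)\,\delta(y-b)\,\delta(z-c)\,dx\,dy\,dz = \int u(x, b, c)\,dx,
\\[4pt]
\iiint u(x, y, z)\,\delta(z-c)\,dx\,dy\,dz = \iint u(x, y, c)\,dx\,dy .
```

Thus a triple integral with a delta-type density reduces to the value of u at a point, to a line integral or to a surface integral, depending on where the mass is concentrated.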

The properties of the generalized functions of several variables 
are similar to those of functions of one variable (see Sec. XIV.27). 
Generalized function (50) can be applied to constructing an influence 
function (Green’s function; see Sec. XIV.26) of the form 


$$G(M, N) = G(x, y, z;\ \xi, \eta, \zeta)$$

where (x, y, z) are the coordinates of the point M (point of observation) 
and $(\xi, \eta, \zeta)$ are the coordinates of the point N at which 
the source (producing the corresponding action) is placed. When 
investigating processes developing in time we also use the delta 
function 

$$\delta(x - a)\,\delta(y - b)\,\delta(z - c)\,\delta(t - \tau)$$

which yields an influence function of the form $G(M, t;\ N, \tau)$. 

20. Multiple Integrals of Higher Order. A measure can also be 
defined in a k-dimensional space or, as we say, in a k-dimensional 
manifold (see Sec. X.2). The definition of an integral of form (48) 
and its basic properties remain unchanged in this case. To pass to a 
repeated integral we must introduce generalized coordinates $t_1, t_2, \ldots, t_k$ 
in the manifold (see Sec. X.2), express the integrand as a 
function of the form $u = u(t_1, \ldots, t_k)$ and find the density 
$\rho(t_1, \ldots, t_k)$ which defines the element of measure 
$d\mu = \rho(t_1, \ldots, t_k)\,dt_1\,dt_2 \ldots dt_k$ corresponding to an infinitesimal 
generalized k-dimensional parallelepiped placed at the variable point 
$(t_1, \ldots, t_k)$ and bounded by the corresponding “coordinate surfaces” 
[which are (k − 1)-dimensional submanifolds in the general 
case]. Then integral (48) takes the form 

$$\int_{(\Omega)} u\,d\mu = \underbrace{\int\!\!\int \cdots \int}_{k\ \text{times}} u(t_1, \ldots, t_k)\,\rho(t_1, \ldots, t_k)\,dt_1\,dt_2 \ldots dt_k \qquad (51)$$

where the limits of integration must be set up on the right-hand 
side according to the ranges of variation of the coordinates $t_1, \ldots, t_k$. 

The density $\rho$ entering in formula (51) is understood as an ordinary 
function if the measure of every submanifold of dimension 
s < k (which can be defined by means of one or more equations 
connecting the coordinates $t_1, \ldots, t_k$) is equal to zero. In particular, 
this is the case if the density is finite everywhere. 

Otherwise, $\rho$ should be understood as a generalized function 
(see Sec. 19). 

If the notion of a volume (hypervolume) is introduced in the 
space under consideration we can perform the integration with 
respect to the volume which is a particular case of a measure. To 
do this we must know the expression 


$$d\Omega = h(t_1, \ldots, t_k)\,dt_1\,dt_2 \ldots dt_k \qquad (52)$$

of the volume of an infinitesimal generalized parallelepiped bounded 
by the corresponding coordinate surfaces (submanifolds). Then the 
integral $\int_{(\Omega)} u\,d\Omega$ can be transformed by analogy with (51). 

The notion of an integral (with respect to a measure or hyper- 
volume) over a domain belonging to any submanifold of lower di- 
mension lying in the initial k-dimensional manifold is introduced 
in a similar way. In the ordinary three-dimensional space we can 
consider line integrals, surface integrals and triple integrals but 
in a k-dimensional space there are k different types of integral (what 
are these types?). 

For the k-dimensional Cartesian space $E_k$ (see Sec. VII.18) we 
put h = 1 in formula (52), i.e. we take the volume of the k-dimensional 
hypercube with unit sides as the unit measure of hypervolume. 
Integrals of lower order in this space are defined under the convention 
that the p-dimensional volume ($1 \leq p < k$) of a p-dimensional 
rectangular parallelepiped (finite or infinitesimal) is equal to the 
product of the lengths of its sides (this is the Lebesgue measure). 

By analogy with Sec. XIV.23, we can consider integrals with 
respect to coordinates taken over a p-dimensional manifold (S) 
($1 \leq p < k$) lying in $E_k$. But (S) must be orientable in this case. 
This is a new notion which cannot be easily visualized for p > 1, 
and we are going to discuss it at length here. 

First of all we shall introduce the notion of a p-dimensional tetrahedron. 
By definition, a one-dimensional tetrahedron is a line segment, 
a two-dimensional tetrahedron is a triangle and a three-dimensional 
one is a triangular pyramid. To obtain a four-dimensional 
tetrahedron we take a three-dimensional tetrahedron lying 
in a three-dimensional space which is considered to be a subspace 
of some space of dimension s > 3. Then we take a point belonging 
to the s-dimensional space which does not belong to the three-dimen- 
sional subspace and connect this point by line segments with all 
the points of the three-dimensional tetrahedron. The totality of 
all the points belonging to these line segments is a four-dimensional 
tetrahedron. The tetrahedrons of higher dimensions are constructed 
similarly. Now take a p-dimensional tetrahedron with vertices 
$A_1, A_2, \ldots, A_{p+1}$. Its orientation is specified by enumerating these 
vertices in a certain order. It is assumed that the permutation of 
any two vertices changes the orientation to the opposite one. For 
instance, if we take a three-dimensional tetrahedron with vertices 
A, B, C, D the combinations ABCD and DBAC define the same 
orientation whereas the combination CBAD defines the opposite 
orientation. Every tetrahedron can be oriented in two different 
ways. 
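
Since every permutation of two vertices reverses the orientation, two enumerations of the vertices define the same orientation exactly when one is obtained from the other by an even number of such permutations. The following short sketch counts the parity; the function name `permutation_sign` is hypothetical and serves only for this illustration.

```python
def permutation_sign(order, reference):
    """Return +1 if `order` is an even permutation of `reference`, else -1.

    +1 means that both enumerations of the vertices define the same
    orientation of the tetrahedron, -1 means the opposite orientation.
    """
    index = [reference.index(v) for v in order]
    sign = 1
    for i in range(len(index)):
        for j in range(i + 1, len(index)):
            if index[i] > index[j]:   # an inversion: the pair stands in the wrong order
                sign = -sign          # the parity of the number of inversions is the parity of the permutation
    return sign
```

For the example given in the text, `permutation_sign("DBAC", "ABCD")` equals 1 and `permutation_sign("CBAD", "ABCD")` equals -1.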

If we take an arbitrary small p-dimensional tetrahedron on a 
p-dimensional manifold (S), choose a certain orientation for it 
and then make it run over the manifold, the original orientation 
of the tetrahedron will induce the 
orientation of all small p-dimensional 
tetrahedrons on (S). Then we say 
that (S) has been oriented. For p = 1 
a manifold of the type (S) is a curve, 
and the above method of orientation 
is equivalent to specifying a certain 
direction on it. For p = 2 such a manifold 
(S) is a two-dimensional surface, 
and its orientation is equivalent 
to specifying the direction of describing the contour of any small 
region on (S). If (S) consists of several disjoint portions they can 
be oriented independently. 

It should be taken into account that in the case $p \geq 2$ some manifolds 
cannot be oriented. The so-called Möbius strip (see Fig. 322) 
discovered in 1858 by the German geometer A. F. Möbius (1790-1868) 
is the simplest example of a non-orientable two-dimensional 
surface. 

A p-fold multiple integral with respect to coordinates taken over 
an oriented p-dimensional manifold (S) lying in $E_k$ is defined as 

$$\int\!\cdots\!\int_{(S)} u(t_1, \ldots, t_k)\,dt_{m_1}\,dt_{m_2} \ldots dt_{m_p} = \lim \sum_{k=1}^{n} u(M_k)\,\Delta S'_k \qquad (53)$$

where the summation on the right-hand side is extended over all 
small tetrahedrons $(\Delta S_k)$ into which (S) is divided, the orientation 
of the tetrahedrons being coherent with the orientation of (S). 
The symbol $\Delta S'_k$ (k = 1, ..., n) designates the p-dimensional 
volume of the projection $(\Delta S'_k)$ of the tetrahedron $(\Delta S_k)$ on the hyperplane 
with the coordinates $t_{m_1}, t_{m_2}, \ldots, t_{m_p}$ taken with the sign + 
or − depending on whether the orientation of the projection $(\Delta S'_k)$ 
(which is also a tetrahedron) coincides with the orientation of the 
tetrahedron $OC_{m_1}C_{m_2} \ldots C_{m_p}$ where $C_j$ (j = 1, ..., k) is a 
point lying on the $t_j$-axis whose distance from the origin of coordinates 
is equal to unity. All the indices $m_1, m_2, \ldots, m_p$ are supposed 
to be different; otherwise integral (53) is considered, by definition, 
to be equal to zero. 

The properties of integral (53) are analogous to the properties 
of the integrals investigated in Sec. 3 with the exception of those 
related to the integration of inequalities. If the orientation of (S) 
is changed or if any two differentials under the integral sign are 
permuted the integral is multiplied by −1 (why?). We also consider 
the sums of integrals of the form 


$$\int\!\cdots\!\int_{(S)} \sum_{m_1, \ldots, m_p = 1}^{k} u_{m_1 m_2 \ldots m_p}(t_1, \ldots, t_k)\,dt_{m_1}\,dt_{m_2} \ldots dt_{m_p} \qquad (54)$$


An integral taken over an ordinary two-dimensional oriented 
surface lying in the ordinary three-dimensional space with the coor- 
dinates x, y, z which can be written in the form 


$$\iint_{(S)} P(x, y, z)\,dx\,dy + Q(x, y, z)\,dy\,dz + R(x, y, z)\,dz\,dx$$


is an example of integral (54). 

When setting up the limits of integration in integral (53) we can 
express all the coordinates $t_j$ different from $t_{m_1}, t_{m_2}, \ldots, t_{m_p}$ 
as functions of $t_{m_1}, t_{m_2}, \ldots, t_{m_p}$ for the points of the manifold (S). Then 
we substitute these expressions into the integrand which thus becomes 
a function of $t_{m_1}, t_{m_2}, \ldots, t_{m_p}$, divide (S) into parts whose 
projections on the hyperplane with the coordinates $t_{m_1}, t_{m_2}, \ldots, t_{m_p}$ 
are of the same orientation and then set up the limits of integration 
in the integrals over each projection. The latter integrals are evaluated 
as ordinary p-fold multiple integrals taken over a p-dimensional 
domain. We can also introduce some convenient curvilinear coordinates 
$s_1, \ldots, s_p$ on (S) and then pass from the differentials 
$dt_{m_1}, \ldots, dt_{m_p}$ to the differentials $ds_1, \ldots, ds_p$. To do this we 
substitute the expression 

$$\frac{D(t_{m_1}, \ldots, t_{m_p})}{D(s_1, \ldots, s_p)}\,ds_1 \ldots ds_p$$

for $dt_{m_1} \ldots dt_{m_p}$ under the sign of integration and set up the corresponding 
limits of integration. 
§ 6. Vector Field 


Multiple integrals are directly applied to the theory of vector 
fields. Here we shall discuss some of the applications. The reader 
should recall the definition of a field given in Sec. IX.9 before pro- 
ceeding to study the subject. 

21. Vector Lines. We say that there is a vector field A (field 
of vector A) defined in space if the value of the vector quantity A 
is specified at each point M of space, i.e. A =A (M). We shall deal 
with a stationary field which does not change as time passes. If 
such a variation takes place we shall consider the field at a fixed 
moment of time and thus reduce our considerations to a stationary 
field. As examples of vector fields, we can consider the field of velocity 
v, the field of momentum density $\rho\mathbf{v}$ (where $\rho$ is the density 
of mass distribution) for a flow of a liquid or gas, the field of force F, 
the electric field E (where E is the electric field strength) etc. 

A curve (L) which is tangent to the vector A at each point is 
called a vector line. In other words, this is a curve whose direction 
(i.e. the direction of its tangent) coincides with the direction of the 
field at each point belonging to the curve. Depending on the physical 
meaning of the field in question we speak about a stream line 
(flow line) of a field of velocity, a line of force of a field of force and 
so on. (Let the reader think why the stream lines coincide with the 
paths of the particles of liquid only in the case of a stationary 
field.) 

From the geometrical point of view the problem of constructing 
vector lines of a given vector field is equivalent to that of construc- 
ting integral curves for a given direction field (see Sec. XV.12). The 
problem is therefore reduced to integrating the corresponding sys- 
tem of differential equations. For this purpose it is necessary to 
introduce a coordinate system in space. For instance, if we take 
Cartesian coordinates x, y, z the vector A can be resolved according 
to the formula 

$$\mathbf{A} = \mathbf{A}(x, y, z) = A_x(x, y, z)\,\mathbf{i} + A_y(x, y, z)\,\mathbf{j} + A_z(x, y, z)\,\mathbf{k} \qquad (55)$$

On the basis of Sec. XV.12, we can put down the symmetric system 
of differential equations for the vector lines of the field A 

$$\frac{dx}{A_x(x, y, z)} = \frac{dy}{A_y(x, y, z)} = \frac{dz}{A_z(x, y, z)}$$

[compare this with equation (XV.66)]. In the case of a plane 
field (see Sec. IX.9) the system turns into the equation 

$$\frac{dx}{A_x(x, y)} = \frac{dy}{A_y(x, y)}$$

As was shown in Chapter XV where the theory of differential 
equations was studied, there is only one vector line passing through 
a non-singular point. Thus, the whole region in which a vector field 
is defined is filled with vector lines of the field. In a sufficiently 
small domain containing a non-singular 
point the totality of the vector lines re- 
sembles the set of parallel segments which 
can be curved a little. In the vicinity of 
a singular point the family of vector lines 
can have a very complicated structure 
(see Fig. 290 which represents some examp- 
les of this kind). 
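
A vector line of a plane field can be traced numerically. The following is only a sketch under stated assumptions: the line is parametrized by an auxiliary variable s so that dx/ds = A_x and dy/ds = A_y (which is equivalent to the symmetric equation of the vector lines), and the classical Runge-Kutta method of the fourth order is used for the integration. The function name `vector_line` and its parameters are hypothetical.

```python
def vector_line(field, start, step, n_steps):
    """Trace a vector line of a plane field from the point `start`.

    field(x, y) must return the pair (A_x, A_y). The line is found from
    dx/ds = A_x, dy/ds = A_y by the classical fourth-order Runge-Kutta
    method with the given step in the auxiliary parameter s.
    """
    x, y = start
    points = [(x, y)]
    for _ in range(n_steps):
        k1x, k1y = field(x, y)
        k2x, k2y = field(x + 0.5 * step * k1x, y + 0.5 * step * k1y)
        k3x, k3y = field(x + 0.5 * step * k2x, y + 0.5 * step * k2y)
        k4x, k4y = field(x + step * k3x, y + step * k3y)
        x += step * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += step * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        points.append((x, y))
    return points
```

For the field A = −y i + x j, whose vector lines are circles with centre at the origin, the traced points stay on the circle passing through the initial point to within the accuracy of the method.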

22. The Flux of a Vector Through a Surface. Let a vector field be 
defined in a domain of space and let an oriented surface $(\sigma)$ (which 
can be closed or non-closed) lie in the domain. We remind the reader that orienting 
a surface is equivalent to indicating its outer and inner sides 
(see Sec. VII.11). The flux of a vector field A through a surface $(\sigma)$ 
is the surface integral 

$$Q = \int_{(\sigma)} A_n\,d\sigma$$

where $A_n$ is the projection of the vector A on the unit outer normal 
n to $(\sigma)$. Using the notion of the vector of an area (see Sec. VII.11) 
and the properties of a scalar product of vectors (see Sec. VII.2) we 
can rewrite the expression of the flux in the form 

$$Q = \int_{(\sigma)} A \cos\alpha\,d\sigma = \int_{(\sigma)} \mathbf{A} \cdot d\boldsymbol{\sigma}$$

(see Fig. 323). 

[Fig. 323] 

If the vector field A is represented in form (55) we can apply 
the transformation 

$$\mathbf{A} \cdot d\boldsymbol{\sigma} = \mathbf{A} \cdot \mathbf{n}\,d\sigma = (A_x\mathbf{i} + A_y\mathbf{j} + A_z\mathbf{k}) \cdot (\cos(n, x)\,\mathbf{i} + \cos(n, y)\,\mathbf{j} + \cos(n, z)\,\mathbf{k})\,d\sigma =$$

$$= A_x \cos(n, x)\,d\sigma + A_y \cos(n, y)\,d\sigma + A_z \cos(n, z)\,d\sigma$$

to calculating the flux. Thus, the integral $\int_{(\sigma)} \mathbf{A} \cdot d\boldsymbol{\sigma}$ is represented 
as the sum of three integrals. The first integral $\int_{(\sigma)} A_x \cos(n, x)\,d\sigma$ 
can be computed if we use the relation $\cos(n, x)\,d\sigma = \pm\,d\sigma_x$, where 
$d\sigma_x$ entering into the right-hand side is the surface element of the 
projection $(\sigma_x)$ of the surface $(\sigma)$ on the y, z-plane and the sign is 
specified by the sign of cos (n, x). If we have the sign + everywhere 
we can write 

$$\int_{(\sigma)} A_x \cos(n, x)\,d\sigma = \int_{(\sigma_x)} A_x\,d\sigma_x = \iint_{(\sigma_x)} A_x(x, y, z)\,dy\,dz$$

When calculating the integral we must substitute the function 
x (y, z) [where x = x (y, z) is the equation of the surface $(\sigma)$] into 
the integrand. The case when we have the sign − everywhere is 
treated similarly. If cos (n, x) changes its sign on $(\sigma)$ we must break 
$(\sigma)$ into several parts so that cos (n, x) should retain its sign on 
each part and then compute the integrals taken over the parts as 
it was indicated above. The other two integrals entering into the 
expression of $\int_{(\sigma)} \mathbf{A} \cdot d\boldsymbol{\sigma}$ are evaluated similarly. 


(o) 

Obviously, the flux is a scalar quantity. Since it is a particular 
case of a surface integral it possesses all the properties of this integral 
(see Sec. 3). Here we point out a characteristic property of 
a flux: it is multiplied by −1 when the orientation of the surface 
is changed because this changes the sign of $A_n$. The 
value of a flux is essentially dependent on the mutual disposition 
of the surface $(\sigma)$ and the vector lines of the field. Indeed, if the 
surface $(\sigma)$ is everywhere intersected by the vector lines from its 
inner side to the outer side (the direction of a vector line at a point 
is indicated by the vector of the field at this point) we have Q > 0; 
if it is everywhere intersected in the opposite direction we have Q < 0; 
finally, if some of the vector lines intersect the surface in one 
direction and some in the opposite direction 
the flux is equal to the sum of a positive and a negative 
quantity (what are these quantities?) and thus it can be positive, 
or negative, or equal to zero. The flux is always equal to zero in 
the case when the surface is totally covered by the arcs of the vector 
lines because the vector A is tangent to such a surface at each point, 
and hence $A_n = 0$. 
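
These properties can be observed on a numerical example. The following is only a sketch under stated assumptions: the surface is a sphere with centre at the origin oriented by its outer normal, and the surface integral is replaced by an integral sum in spherical coordinates. The function name `flux_through_sphere` and its parameters are hypothetical.

```python
import math


def flux_through_sphere(field, radius=1.0, n_theta=100, n_phi=200):
    """Approximate the flux of `field` through a sphere centred at the origin.

    field(x, y, z) must return the triple (A_x, A_y, A_z). The sphere is
    oriented by its outer normal; the integral sum uses the midpoints of
    the cells in the spherical angles theta and phi.
    """
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # unit outer normal n; the point of the sphere is radius * n
            nx = math.sin(theta) * math.cos(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(theta)
            ax, ay, az = field(radius * nx, radius * ny, radius * nz)
            a_n = ax * nx + ay * ny + az * nz       # projection of A on the normal
            d_sigma = radius ** 2 * math.sin(theta) * d_theta * d_phi
            total += a_n * d_sigma
    return total
```

For the field A = x i + y j + z k the result for the unit sphere is close to 4π since the projection of A on the normal equals unity there, whereas for the field A = −y i + x j, which is tangent to the sphere at each point, the flux vanishes.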

The physical meaning of a flux depends on the type of the field. 
For instance, let the velocity field v of a gas flow be considered. Then 
the quantity 

$$dQ = \mathbf{v} \cdot d\boldsymbol{\sigma}$$

is equal to the volume of an elementary gas cylinder passing through 
the area $(d\sigma)$ in unit time (see Sec. VII.11). Consequently, in this 
case the entire flux is equal to the volume of gas passing through 
the surface $(\sigma)$ in unit time from its inner side to the outer side. 
We can similarly verify that in the case of the field $\mathbf{A} = \rho\mathbf{v}$ the 
flux is equal to the mass of gas passing through $(\sigma)$ in unit time. 
(Let the reader think what are the implications of the properties 
of a flux enumerated in the preceding paragraph in the case of the 
above examples.) 

The flux of a vector field A through a surface $(\sigma)$ is sometimes 
referred to as the number of vector lines of the field A intersecting 
$(\sigma)$ from its inner side to the outer side. This is of course a conventional 
term because the numerical value of a flux can be a fractional 
number and besides a flux is usually a dimensional quantity. But 
nevertheless this terminology is commonly used because of its 
convenience. It should be taken into account that the number of 
vector lines understood in the above sense is an algebraic quantity. 
For instance, if some portion of the surface $(\sigma)$ is intersected from 
the inner side to the outer side and the other portion is intersected 
in the opposite direction the total number of the lines intersecting 
$(\sigma)$ can be positive or negative or zero depending on the portion 
which is intersected by a greater number of lines. 

23. Divergence. Let us take a volume $(\Omega)$ bounded by a surface 
$(\sigma)$ and lying in a domain of space where a vector field A is defined. 
Suppose that the closed surface $(\sigma)$ is oriented so that $(\Omega)$ adjoins 
its inner side. The flux of the field through the surface is equal to 
the integral 

$$Q = \oint_{(\sigma)} \mathbf{A} \cdot d\boldsymbol{\sigma}$$

(the symbol $\oint$ indicates that the integral is taken over a closed 
surface; of course, we can write the ordinary sign of integration 
instead of this symbol). If the flux is positive this means that the 
number of vector lines passing through $(\sigma)$ from the interior of the 
domain $(\Omega)$ exceeds the number of lines passing in the opposite 
direction. In this case we say that there is a source (positive source) 
of vector lines in $(\Omega)$. The quantity Q characterizes the source strength. 
If Q < 0 we say that there is a sink in $(\Omega)$. A sink is usually termed 
a negative source. For the sake of simplicity we shall always regard 
a sink as a particular case of a source. If Q = 0 this means that 
either there are no sources and sinks in $(\Omega)$ or they mutually compensate. 
By the way, in the case $Q \neq 0$ there can also be both sources 
and sinks in $(\Omega)$ which do not completely compensate one another. 
The model based on the notion of vector lines originating 
in the interior of a volume $(\Omega)$ is justified by the following 
property: if the volume $(\Omega)$ is divided into several parts $(\Omega_1)$, 
$(\Omega_2)$, ..., $(\Omega_n)$ with the help of some surfaces the total flux of a vector 
field A through the boundary surface of $(\Omega)$ (in the outward direction) 
is equal to the sum of the fluxes taken for each subdomain 
$(\Omega_1)$, $(\Omega_2)$, ..., $(\Omega_n)$ (the proof of the property is left to the reader). 

The sources of a vector field can be concentrated at separate points 
or distributed over some surfaces or curves. They can also be distributed 
in space (recall the general concept of a quantity distributed 
in space which was discussed in § 2). We first turn to the latter case. 
Here we can introduce not only the average density $\frac{Q}{\Omega}$ of the source 
in $(\Omega)$ [as before, the symbol $\Omega$ designates the volume of the domain 
$(\Omega)$] but also the density of the sources of the field at any point M 
of space which is defined as 

$$\frac{dQ}{d\Omega} = \lim_{(\Delta\Omega) \to M} \frac{\Delta Q}{\Delta\Omega} = \lim_{(\Delta\Omega) \to M} \frac{\oint_{(\Delta\sigma)} \mathbf{A} \cdot d\boldsymbol{\sigma}}{\Delta\Omega} \qquad (56)$$

where $(\Delta\Omega)$ is a small volume enveloping the point M and $(\Delta\sigma)$ 
is the surface which bounds $(\Delta\Omega)$. 

This density is called the divergence of the vector field A and is 
designated as div A. Hence, we can say that the divergence of a 
vector field can be interpreted as the number of vector lines genera- 
ted in the interior of an infinitesimal volume (i.e. the flux of the 
field through the surface bounding the volume) related to unit 
volume. It should be noted that the divergence of a vector field 
is a scalar quantity. Moreover, it forms a scalar field because it 
assumes a certain numerical value at each point in space. 

Formula (56) can be rewritten in the form 

$$\operatorname{div}\mathbf{A} = \frac{dQ}{d\Omega}, \quad\text{i.e.}\quad dQ = \operatorname{div}\mathbf{A}\,d\Omega$$

The last expression represents the number of vector lines issued 
from the element of volume $(d\Omega)$. Summing together these expressions 
over a domain $(\Omega)$ (see Sec. 4) we arrive at the formula for 
the number of vector lines coming out of the finite volume $(\Omega)$ (that 
is for the flux of the field A): 

$$\oint_{(\sigma)} \mathbf{A} \cdot d\boldsymbol{\sigma} = \int_{(\Omega)} \operatorname{div}\mathbf{A}\,d\Omega \qquad (57)$$

where $(\Omega)$ is any finite domain and $(\sigma)$ is its boundary surface. This 
is Ostrogradsky’s formula which plays an important role in the vector 
field theory. It was discovered by Ostrogradsky (in scalar form) 
in 1826. The formula holds in all cases when the field A and its 
divergence div A do not approach infinity in $(\Omega)$. It is also valid 
when the divergence approaches infinity in such a way that the 
integral on the right-hand side of formula (57) is convergent. 
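
Ostrogradsky's formula can be checked numerically in a simple case. The following is only a sketch under stated assumptions: the domain is a cube with edges parallel to the coordinate axes, the divergence is computed from its definition (56) as the flux through a small cube divided by its volume, and both integrals are replaced by integral sums. The names `sample_field`, `cube_flux`, `divergence` and `volume_integral_of_divergence` are hypothetical, and the sample field is chosen only for illustration.

```python
def sample_field(x, y, z):
    """Sample field A = x^2 i + y^2 j + z^2 k (an assumption for the illustration)."""
    return (x * x, y * y, z * z)


def cube_flux(f, corner, side, n=8):
    """Outward flux of f through the surface of a cube (midpoint rule on each face)."""
    x0, y0, z0 = corner
    h = side / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u = (i + 0.5) * h
            v = (j + 0.5) * h
            # pairs of opposite faces; the outer normals are +i and -i, +j and -j, +k and -k
            total += (f(x0 + side, y0 + u, z0 + v)[0] - f(x0, y0 + u, z0 + v)[0]) * h * h
            total += (f(x0 + u, y0 + side, z0 + v)[1] - f(x0 + u, y0, z0 + v)[1]) * h * h
            total += (f(x0 + u, y0 + v, z0 + side)[2] - f(x0 + u, y0 + v, z0)[2]) * h * h
    return total


def divergence(f, point, size=1e-3):
    """Definition (56): flux through a small cube around the point per unit volume."""
    x, y, z = point
    corner = (x - size / 2, y - size / 2, z - size / 2)
    return cube_flux(f, corner, size, n=1) / size ** 3


def volume_integral_of_divergence(f, corner, side, n=10):
    """Integral sum of div A over the cube (midpoints of n**3 cells)."""
    x0, y0, z0 = corner
    h = side / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                m = (x0 + (i + 0.5) * h, y0 + (j + 0.5) * h, z0 + (k + 0.5) * h)
                total += divergence(f, m) * h ** 3
    return total
```

For the sample field and the unit cube with a vertex at the origin both `cube_flux(sample_field, (0.0, 0.0, 0.0), 1.0)` and `volume_integral_of_divergence(sample_field, (0.0, 0.0, 0.0), 1.0)` are close to 3, in accordance with formula (57).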

The physical meaning of the divergence of a field depends on the 
nature of the vector field A. For instance, by Sec. 22, for the velocity 
field v of a gas flow div v is equal to the rate of relative expansion 
of an infinitesimal volume of gas and $\operatorname{div}(\rho\mathbf{v})$ is equal to the density 
of mass sources. If the mass of the gas remains constant in the 
process of its flow we must have $\operatorname{div}(\rho\mathbf{v}) = 0$ (in the general case 
the mass can receive an increment, positive or negative, resulting 
from a chemical or some other reaction in which the mass can change). 
At the same time we can have div v > 0, div v < 0 or div v = 0 
depending on whether the gas expands, contracts or does not 
change its density in the process of flow. If we take an electric field 
E its divergence, i.e. div E, is proportional to the density of a charge 
distributed in space and so on. 

If a field has sources distributed over curves or surfaces (its volu- 
me density must be discontinuous in such a case) we can speak about 
the line density or the surface density. In such a case we must add 
to the right-hand side of formula (57) the corresponding line and 
surface integrals taken over the curves and the surfaces carrying 
the sources which lie in the domain $(\Omega)$. If there are point sources 
in (Q) the corresponding summands should also be added to the 
right-hand side of (57). If we understand the densities as generalized 
functions which were discussed in Sec. XIV.7 and Sec. 19 formu- 
la (57) will be true in all cases. 

If we take a plane vector field A the divergence is defined as 

$$(\operatorname{div}\mathbf{A})_M = \lim_{(\Delta\sigma) \to M} \frac{\oint_{(\Delta l)} A_n\,dl}{\Delta\sigma} \qquad (58)$$

[formula (58) replaces formula (56) in this case]. Here $(\Delta\sigma)$ is a 
small plane figure enveloping the point M and $(\Delta l)$ is the contour 
bounding the figure. As is known (see Sec. IX.9), a plane field can 
be interpreted in two ways. If the field is defined only in a given 
plane then, by definition, the numerator on the right-hand side 
of formula (58) is considered to be the flux of the vector field A 
across the curve $(\Delta l)$. But if the field A is originally defined in space 
and is regarded as a plane field because it does not depend on one 
of the Cartesian coordinates (for instance, on z) the numerator is 
equal to the flux of the vector field A through the lateral surface 
of a right cylinder [with base $(\Delta\sigma)$ and unit height] whose elements 
are parallel to the z-axis. In this case the denominator on the right-hand 
side of formula (58) is equal to the volume of the cylinder 
(why is it so?). 

Ostrogradsky’s formula for a plane field has the form

$$\oint_{(l)} A_n\,dl = \int_{(\sigma)} \operatorname{div}\mathbf{A}\,d\sigma$$

where (σ) is a finite plane figure and (l) is its contour.

The divergence of a field can sometimes be directly computed 
on the basis of its definition (56). For example, let us consider a 
centrally symmetric field in space which is defined by the formula 


$$\mathbf{A} = f(r)\,\mathbf{r}^{\circ}$$

where r is the radius-vector of a variable point, r° = r/r is the unit
vector along r, and f (r) is a given function of the modulus r of r
(see Fig. 324). Then the flux across the sphere of radius r is equal to

$$Q(r) = \oint A_n\,d\sigma = \oint f(r)\,d\sigma = f(r)\,4\pi r^2$$

and therefore the number of vector lines issued from a thin spherical
layer of width dr is equal to

$$dQ = 4\pi\,d\,[r^2 f(r)] = 4\pi\,[2r f(r) + r^2 f'(r)]\,dr$$

Consequently, we have

$$\operatorname{div}\mathbf{A} = \frac{2}{r}\,f(r) + f'(r)$$

which is obtained from the above expression for dQ after it has been
divided by the volume dΩ = 4πr² dr of the layer.
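
The layer argument above can be checked numerically. The following is a minimal sketch, assuming a sample profile f(r) = exp(−r); the profile and all function names are illustrative choices, not part of the text. It divides the flux issued from a thin spherical layer by the volume of the layer and compares the ratio with (2/r) f(r) + f'(r).

```python
import math

def f(r):
    # Sample radial profile f(r); an assumption made only for illustration.
    return math.exp(-r)

def f_prime(r):
    # Derivative of the sample profile above.
    return -math.exp(-r)

def flux_through_sphere(r):
    # Q(r) = f(r) * 4*pi*r^2, the flux of A = f(r) r° across the sphere of radius r.
    return 4.0 * math.pi * r * r * f(r)

def div_from_layer(r, dr=1e-6):
    # Flux issued from the layer [r, r + dr] divided by the volume of the layer.
    layer_volume = 4.0 / 3.0 * math.pi * ((r + dr) ** 3 - r ** 3)
    return (flux_through_sphere(r + dr) - flux_through_sphere(r)) / layer_volume

def div_closed_form(r):
    # div A = (2/r) f(r) + f'(r), as obtained in the text.
    return 2.0 * f(r) / r + f_prime(r)

if __name__ == "__main__":
    for r in (0.5, 1.5, 3.0):
        print(r, div_from_layer(r), div_closed_form(r))
```

For a small layer width the two columns should agree to several decimal places; the agreement degrades only when dr is taken so small that rounding errors dominate.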

24. Expressing Divergence in Cartesian Coordinates. Let a Cartesian
coordinate system x, y, z be given in space. Then a vector
field A can be represented in form (55). In this case we can deduce
a simple formula for computing the divergence div A. To do this we
take into account the fact that the particular form of an infinitesimal
domain (ΔΩ) entering into the definition of a divergence [see
formula (56)] is inessential. Therefore we can take a small rectangular
parallelepiped with faces parallel to the coordinate planes as this
volume (see Fig. 325).

Fig. 324    Fig. 325

Then the numerator of the fraction in expression (56) can be
represented as a sum of six summands corresponding to the six faces
of the parallelepiped. We now consider the sum of two summands
corresponding to the faces (designated by I and II in Fig. 325) which
are parallel to the y, z-plane and whose unit outer normals are denoted
as n_I and n_II. We have (A_n)_I = −(A_x)_I and on the basis of Taylor’s
formula (see Sec. IV.15) we can write

$$(A_n)_{II} = (A_x)_{II} = (A_x)_{I} + (\partial_x A_x)_{I} + \dots$$

where the expression $\partial_x A_x = \frac{\partial A_x}{\partial x}\,\Delta x$ is the partial differential
with respect to x which appears here because the points belonging
to the faces I and II differ by Δx in the values of their abscissas x.
The dots on the right-hand side of the formula designate the terms
of higher order of smallness which are not put down here. The
integration over these faces reducing to the integration over their
projections onto the y, z-plane [i.e. over (Δσ)_x], we have

$$\int_{(I)} A_n\,d\sigma + \int_{(II)} A_n\,d\sigma = \iint_{(\Delta\sigma)_x} (A_n)_{I}\,dy\,dz + \iint_{(\Delta\sigma)_x} (A_n)_{II}\,dy\,dz =$$

$$= \iint_{(\Delta\sigma)_x} \Big(\frac{\partial A_x}{\partial x}\Big)_{I} \Delta x\,dy\,dz + \dots = \Big(\iint_{(\Delta\sigma)_x} \Big(\frac{\partial A_x}{\partial x}\Big)_{I} dy\,dz\Big)\,\Delta x + \dots =$$

$$= \Big(\frac{\partial A_x}{\partial x}\Big)_{av} \Delta x\,\Delta y\,\Delta z + \dots = \Big(\frac{\partial A_x}{\partial x}\Big)_{M} \Delta x\,\Delta y\,\Delta z + \dots$$

The dots here also designate the terms of higher order of smallness
relative to the terms which are written in full. The subscripts I,
II and M mean that the corresponding terms are taken for the point
belonging to the face I, II or for the point M, respectively. The
inscription av means the average (mean) value. When performing
the last transformation in the above formula we have used the formula

$$\Big(\frac{\partial A_x}{\partial x}\Big)_{av} = \Big(\frac{\partial A_x}{\partial x}\Big)_{M} + \text{infinitesimal}$$

and in the preceding transformation we have taken advantage of
the formula for the mean value of a function (property 10 in
Sec. XVI.3).
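Performing similar calculations for the other two pairs of faces
and summing up all the expressions we arrive at the formula for
the flux through the whole boundary surface of the parallelepiped:

$$\oint_{(\Delta\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma} = \Big(\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\Big)_{M} \Delta x\,\Delta y\,\Delta z + \dots$$

In this case we have ΔΩ = Δx Δy Δz and therefore

$$\frac{1}{\Delta\Omega}\oint_{(\Delta\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma} = \Big(\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\Big)_{M} + \dots$$

Passing to the limit we finally obtain the expression

$$\operatorname{div}\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \tag{59}$$

We have not written the subscript M here because the formula holds
for any point of the field.

We can similarly find the expression of a divergence for a plane
field. The result must obviously be the same as (59) with the exception
that the last term entering into (59) should be deleted.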
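
The passage from definition (56) to formula (59) can be illustrated numerically. The sketch below assumes a sample field A = (xy, yz², x + z³), chosen only for illustration; the function names are likewise hypothetical. It computes the outward flux through the six faces of a small cube, divides it by the volume of the cube, and compares the ratio with the sum of partial derivatives from (59), which was worked out by hand for this field.

```python
def field(x, y, z):
    # Sample field (an illustrative assumption): A = (x*y, y*z^2, x + z^3).
    return (x * y, y * z * z, x + z ** 3)

def div_by_formula_59(x, y, z):
    # dAx/dx + dAy/dy + dAz/dz for the sample field, differentiated by hand.
    return y + z * z + 3.0 * z * z

def flux_over_volume(x, y, z, h=1e-3, n=8):
    # Outward flux through a cube of edge h centred at (x, y, z),
    # using the midpoint rule on an n-by-n grid on each face,
    # divided by the volume h^3 of the cube.
    half = h / 2.0
    step = h / n
    cell = step * step
    total = 0.0
    for i in range(n):
        for j in range(n):
            u = -half + (i + 0.5) * step
            v = -half + (j + 0.5) * step
            # Pair of faces perpendicular to the x-axis (normals +i and -i).
            total += (field(x + half, y + u, z + v)[0]
                      - field(x - half, y + u, z + v)[0]) * cell
            # Pair of faces perpendicular to the y-axis.
            total += (field(x + u, y + half, z + v)[1]
                      - field(x + u, y - half, z + v)[1]) * cell
            # Pair of faces perpendicular to the z-axis.
            total += (field(x + u, y + v, z + half)[2]
                      - field(x + u, y + v, z - half)[2]) * cell
    return total / h ** 3

if __name__ == "__main__":
    point = (0.5, -0.3, 0.8)
    print(flux_over_volume(*point), div_by_formula_59(*point))
```

Each pair of opposite faces contributes one partial derivative, exactly as in the derivation; shrinking h reduces the higher-order terms denoted by dots in the text.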

25. Line Integral and Circulation. Let an oriented curve (L)
(i.e. such that the direction of describing this curve is indicated)
be given in the domain of space where a vector field A is defined.
Then we can form the line integral

$$I = \int_{(L)} A_\tau\,dL \tag{60}$$

taken along (L) where A_τ is the projection of the vector A on the
unit vector τ tangent to (L) drawn in the direction of describing
the curve (L) (see Fig. 326). Since the vector dr goes along τ and
|dr| = dL (see Sec. VII.23) the expression for line integral (60)
can be rewritten in the form

$$I = \int_{(L)} A\cos\alpha\,|d\mathbf{r}| = \int_{(L)} \mathbf{A}\cdot d\mathbf{r} = \int_{(L)} (A_x\,dx + A_y\,dy + A_z\,dz) \tag{61}$$

Fig. 326

Line integral (60), or (61), is a scalar quantity possessing all the
properties of line integrals (see Sec. XIV.6). If the orientation of
the curve (L) is changed the integral is multiplied by −1. If the
angle α shown in Fig. 326 is acute at all the points of the curve (L)
we have I > 0 and if the angle is obtuse I < 0. Finally, I = 0
if α is a right angle at all points. We can also have I = 0 when
the angle α is acute for one portion of the curve (L) and is obtuse
for the other but varies in such a way that the integrals taken over
these portions mutually cancel out.

A line integral of type (60) has an obvious physical meaning when 
A is a field of force. As we showed in Sec. XIV.22, in this case the 
integral is equal to the work performed by the field when the point 
upon which the force acts describes the curve (L). 

If (L) is a closed curve line integral (60) is called the circulation
[in this case we can write the integral as $\oint_{(L)} (A_x\,dx + A_y\,dy + A_z\,dz)$].
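
A line integral of type (61) is easy to approximate by summing A·Δr over short chords of the curve. The sketch below is an illustration under assumed inputs: the field A = (−y, x, 0) and the unit circle are sample choices, and the helper name `line_integral` is hypothetical. It also shows that reversing the orientation of the curve changes the sign of the integral.

```python
import math

def line_integral(field, curve, t0, t1, n=20000):
    # Approximates the integral of A . dr along curve(t), t0 <= t <= t1,
    # as a sum of A(midpoint) . (chord vector) over n short arcs.
    total = 0.0
    dt = (t1 - t0) / n
    prev = curve(t0)
    for k in range(1, n + 1):
        cur = curve(t0 + k * dt)
        mid = curve(t0 + (k - 0.5) * dt)
        a = field(*mid)
        total += sum(ai * (c - p) for ai, c, p in zip(a, cur, prev))
        prev = cur
    return total

def sample_field(x, y, z):
    # Sample field (an assumption for illustration): A = (-y, x, 0).
    return (-y, x, 0.0)

def unit_circle(t):
    # Unit circle in the x, y-plane described counterclockwise.
    return (math.cos(t), math.sin(t), 0.0)

def reversed_circle(t):
    # The same circle described in the opposite direction.
    return unit_circle(-t)

circulation = line_integral(sample_field, unit_circle, 0.0, 2.0 * math.pi)
reversed_circulation = line_integral(sample_field, reversed_circle, 0.0, 2.0 * math.pi)

if __name__ == "__main__":
    print(circulation, reversed_circulation, 2.0 * math.pi)
```

For this field the vector A is tangent to the circle and of unit length, so the angle α is zero everywhere and the circulation equals the length of the circle.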

26. Rotation. For our further aims we need the expression of
a circulation taken along an infinitesimal closed contour (ΔL). To
obtain the expression we assume that the vector field A is represented
in form (55), i.e. as being resolved into components along
the unit vectors of the Cartesian axes. Let the contour (ΔL) be placed
near a point M₀ of space. Now we compute the integral of the first
summand entering into the right-hand side of formula (61):

$$\oint_{(\Delta L)} A_x(x, y, z)\,dx = \oint_{(\Delta L)} \Big[(A_x)_0 + \Big(\frac{\partial A_x}{\partial x}\Big)_0 (x - x_0) + \Big(\frac{\partial A_x}{\partial y}\Big)_0 (y - y_0) + \Big(\frac{\partial A_x}{\partial z}\Big)_0 (z - z_0) + \dots\Big]\,dx \tag{62}$$

Here we have applied Taylor’s formula (see Sec. XII.6). The subscript
“zero” indicates that the corresponding quantities are taken
at the point M₀, and the dots designate the terms of higher order
of smallness. Recall that in Sec. XIV.23 we proved the formulas
which can be written in the notation of this section as

$$\oint_{(\Delta L)} [C_1 + C_2 x]\,dx = 0, \qquad \oint_{(\Delta L)} y\,dx = -\Delta S\cos(\mathbf{n}, z) \qquad\text{and}\qquad \oint_{(\Delta L)} z\,dx = \Delta S\cos(\mathbf{n}, y)$$

where C₁ and C₂ are arbitrary constants, ΔS is the area of the surface
(ΔS) bounded by the curve (ΔL) (this surface can be regarded as
a plane figure to within infinitesimals of higher order) and n is the
unit outer normal to (ΔS) whose direction is coherent with the
direction of describing (ΔL) according to the right-hand screw rule
(see Sec. VII.11). Hence we can write the relation of the form

$$\oint_{(\Delta L)} \Big[(A_x)_0 + \Big(\frac{\partial A_x}{\partial x}\Big)_0 (x - x_0) - \Big(\frac{\partial A_x}{\partial y}\Big)_0 y_0 - \Big(\frac{\partial A_x}{\partial z}\Big)_0 z_0\Big]\,dx = \oint_{(\Delta L)} [C_1 + C_2 x]\,dx = 0$$

Substituting all these results into (62) we obtain

$$\oint_{(\Delta L)} A_x\,dx = \Big[\Big(\frac{\partial A_x}{\partial z}\Big)_0 \cos(\mathbf{n}, y) - \Big(\frac{\partial A_x}{\partial y}\Big)_0 \cos(\mathbf{n}, z)\Big]\,\Delta S + \dots \tag{63}$$

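Evaluating the other two integrals on the right-hand side of formula
(61) in a similar manner [to perform the evaluation it is sufficient
to make two successive circular permutations of the coordinates
in formula (63)] and summing up the results we deduce

$$\oint_{(\Delta L)} \mathbf{A}\cdot d\mathbf{r} = \Big\{\Big[\Big(\frac{\partial A_x}{\partial z}\Big)_0 \cos(\mathbf{n}, y) - \Big(\frac{\partial A_x}{\partial y}\Big)_0 \cos(\mathbf{n}, z)\Big] + \Big[\Big(\frac{\partial A_y}{\partial x}\Big)_0 \cos(\mathbf{n}, z) - \Big(\frac{\partial A_y}{\partial z}\Big)_0 \cos(\mathbf{n}, x)\Big] +$$

$$+ \Big[\Big(\frac{\partial A_z}{\partial y}\Big)_0 \cos(\mathbf{n}, x) - \Big(\frac{\partial A_z}{\partial x}\Big)_0 \cos(\mathbf{n}, y)\Big]\Big\}\,\Delta S + \dots =$$

$$= \Big\{\Big[\Big(\frac{\partial A_z}{\partial y}\Big)_0 - \Big(\frac{\partial A_y}{\partial z}\Big)_0\Big] \cos(\mathbf{n}, x) + \Big[\Big(\frac{\partial A_x}{\partial z}\Big)_0 - \Big(\frac{\partial A_z}{\partial x}\Big)_0\Big] \cos(\mathbf{n}, y) + \Big[\Big(\frac{\partial A_y}{\partial x}\Big)_0 - \Big(\frac{\partial A_x}{\partial y}\Big)_0\Big] \cos(\mathbf{n}, z)\Big\}\,\Delta S + \dots \tag{64}$$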
To simplify the above expression let us introduce a vector (or,
more precisely, a vector field) which is called the rotation (or curl)
of the vector A (of the vector field A) and designated as rot A. The
rotation is defined by the formula

$$\operatorname{rot}\mathbf{A} = \Big(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\Big)\mathbf{i} + \Big(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\Big)\mathbf{j} + \Big(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\Big)\mathbf{k} \tag{65}$$

In the general case vector (65) varies as the corresponding point
moves in space. The vectors rot A form a new vector field in those
parts of space where the original field A is defined.


If we take into account that n is a unit vector we can resolve it
in the form

$$\mathbf{n} = \cos(\mathbf{n}, x)\,\mathbf{i} + \cos(\mathbf{n}, y)\,\mathbf{j} + \cos(\mathbf{n}, z)\,\mathbf{k}$$

(see Sec. VII.9). Thus, we can rewrite formula (64) in a simpler form:

$$\oint_{(\Delta L)} \mathbf{A}\cdot d\mathbf{r} = (\operatorname{rot}\mathbf{A})_0\cdot\mathbf{n}\,\Delta S + \dots = (\operatorname{rot}_n\mathbf{A})_0\,\Delta S + \dots \tag{66}$$

The subscript n in the last expression indicates that the vector
rot A is projected on the normal n, and the dots, as before, designate
the corresponding terms of higher order of smallness which have
not been put down here. It is formula (66) that expresses the
circulation along an infinitesimal contour.

Dividing both sides of formula (66) by ΔS and passing to the limit
as (ΔL) → M, that is as the contour (ΔL) contracts to the point
M, we obtain the expression

$$(\operatorname{rot}_n\mathbf{A})_M = \lim_{(\Delta L)\to M} \frac{\oint_{(\Delta L)} \mathbf{A}\cdot d\mathbf{r}}{\Delta S} \tag{67}$$

where the meaning of the notation is quite clear.
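
Formula (67) can be tried out on a concrete field. The sketch below assumes the sample field A = (yz, x², xyz), for which formula (65), evaluated by hand, gives rot A = (xz, y − yz, 2x − z); the field and the function names are illustrative assumptions. The contour is a small square lying in a plane z = const and described counterclockwise as seen from the positive z-axis, so that n = k by the right-hand screw rule.

```python
def field(x, y, z):
    # Sample field (an illustrative assumption): A = (y*z, x^2, x*y*z).
    return (y * z, x * x, x * y * z)

def rot_by_formula_65(x, y, z):
    # Components of rot A for the sample field, differentiated by hand.
    return (x * z, y - y * z, 2.0 * x - z)

def circulation_over_square(x0, y0, z0, h=1e-3, n=50):
    # Circulation of A along a square of edge h centred at (x0, y0, z0),
    # lying in the plane z = z0 and described counterclockwise (n = k).
    half = h / 2.0
    step = h / n
    total = 0.0
    for i in range(n):
        s = -half + (i + 0.5) * step
        total += field(x0 + s, y0 - half, z0)[0] * step  # lower side, along +x
        total += field(x0 + half, y0 + s, z0)[1] * step  # right side, along +y
        total -= field(x0 + s, y0 + half, z0)[0] * step  # upper side, along -x
        total -= field(x0 - half, y0 + s, z0)[1] * step  # left side, along -y
    return total

def rot_n_by_formula_67(x0, y0, z0, h=1e-3):
    # Ratio of the circulation to the area of the square.
    return circulation_over_square(x0, y0, z0, h) / (h * h)

if __name__ == "__main__":
    point = (0.7, -0.4, 1.3)
    print(rot_n_by_formula_67(*point), rot_by_formula_65(*point)[2])
```

Turning the square so that its normal points along i or j would reproduce the other two components of rot A in the same way.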

Thus, the projection of the rotation of a field on any direction n 
at any point M of space is equal to the ratio of the circulation of 
the field over an infinitesimal contour bounding a surface perpen- 
dicular to n to the area of the surface. This property implies that 
the rotation whose definition (65) is connected with the particular 
choice of a coordinate system is in fact invariant, i.e. independent 




of the choice, since the right-hand side of (67) does not depend on 
the choice. Thus the projection of rot A on any direction is uniquely 
defined which means that the vector rot A itself is uniquely specified 
at each point. 

Besides, formula (67) shows that the rotation of a field of true
vectors is a pseudovector (see Sec. VII.14) because if the screw rule
is changed we must change the direction of describing (ΔL) (see
Sec. VII.11) which results in changing the sign of the right-hand
side of (67).

Fig. 327 represents several simple examples of vector fields. The
rotations of the fields which can be found by means of formula (65)
or (67) are also put down in Fig. 327. To perform the computations
according to formula (67) we must choose (ΔL) in a convenient way.
Let the reader perform the computations.

Fig. 327 [three sample vector fields (a), (b), (c); (a) A = const, rot A = 0]

The third example in
Fig. 327 represents the field of linear velocities for the revolution
of a rigid body about the z-axis (drawn perpendicularly to the plane
of the figure) with constant angular velocity ω. We see that the
rotation of such a field is a constant vector equal to the doubled
angular velocity vector. Cauchy proved that when a continuous
medium moves in an arbitrary way (we mean gas, liquid, an elastic
or rigid body and the like) the motion of every small volume can
be represented at each moment of time as a superposition of several
motions whose velocity fields are of the forms shown in Fig. 327
(these are translatory motion, deformation and rotary motion). A
nonzero rotation appearing only for a rotary motion, we see that
the rotation of the field of linear velocities for an arbitrary motion
of a medium is equal to the doubled angular velocity of the
corresponding particle at each point in space. Of course, in the general
case the rotation can be different at different points. Hence, if the
rotation of the velocity field of a flow of gas or liquid is different
from zero this indicates that there are vortices in the flow. This
accounts for the term “rotation”.

The rotation of a plane field has a particularly simple expression.
Indeed, if A = A_x (x, y) i + A_y (x, y) j formula (65) implies that
in this case we have

$$\operatorname{rot}\mathbf{A} = \Big(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\Big)\mathbf{k}$$
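
The statement about the doubled angular velocity can be verified with the plane formula just obtained. The sketch below is an illustration under assumptions: the value of the angular velocity, the helper names and the use of central differences are all choices made here, not taken from the text. The velocity field of a rigid revolution about the z-axis is v = (−ωy, ωx).

```python
OMEGA = 1.7  # sample angular velocity; an arbitrary illustrative value

def rigid_rotation(x, y):
    # Linear velocities for revolution about the z-axis: v = (-omega*y, omega*x).
    return (-OMEGA * y, OMEGA * x)

def uniform_field(x, y):
    # A constant field, corresponding to translatory motion.
    return (3.0, -1.0)

def plane_rot(field, x, y, h=1e-5):
    # k-component of rot A for a plane field: dAy/dx - dAx/dy,
    # with both derivatives approximated by central differences.
    d_ay_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2.0 * h)
    d_ax_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2.0 * h)
    return d_ay_dx - d_ax_dy

if __name__ == "__main__":
    print(plane_rot(rigid_rotation, 0.4, -2.0), 2.0 * OMEGA)
    print(plane_rot(uniform_field, 0.4, -2.0))
```

The result for the rigid rotation does not depend on the point chosen, in agreement with the remark that the rotation of this field is a constant vector.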

27. Green’s Formula. Stokes’ Formula. These formulas enable
us to transform the circulation of a vector field over a closed contour
to a double integral over a surface bounded by the contour.
Green’s formula is related to a plane field and Stokes’ formula
deals with the general case of a spatial field. The former formula
directly follows from the latter but we shall deduce Green’s formula
independently because the deduction is very simple.

Fig. 328

Let us consider the circulation of a plane field A = P (x, y) i +
+ Q (x, y) j over a closed contour (L) which is described in
the positive direction. The finite domain bounded by the contour
(L) will be designated by (S) (see Fig. 328). By formula (61),
the circulation can be written as

$$\Gamma = \oint_{(L)} P(x, y)\,dx + \oint_{(L)} Q(x, y)\,dy \tag{68}$$

For the first integral in (68) we obtain

$$\int_a^b P(x, y_1)\,dx + \int_b^a P(x, y_2)\,dx = -\int_a^b [P(x, y_2) - P(x, y_1)]\,dx \tag{69}$$

(see Fig. 328), where y₁ = y₁(x) and y₂ = y₂(x) on the lower and upper
arcs of (L), respectively. The expression under the sign of integration is a
partial increment of P with respect to y which can be represented
in the form of an integral of the derivative:

$$P(x, y_2) - P(x, y_1) = \int_{y_1(x)}^{y_2(x)} \frac{\partial P}{\partial y}\,dy$$
Substituting this expression into (69) we obtain

$$\oint_{(L)} P\,dx = -\int_a^b dx \int_{y_1(x)}^{y_2(x)} \frac{\partial P}{\partial y}\,dy = -\int_{(S)} \frac{\partial P}{\partial y}\,dS$$

The second integral in (68) is transformed similarly (we leave the
calculations to the reader). Adding together the results we arrive
at Green’s formula:

$$\oint_{(L)} (P\,dx + Q\,dy) = \int_{(S)} \Big(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\Big)\,dS \tag{70}$$

The formula is applicable if all the functions P, Q, ∂P/∂y and ∂Q/∂x
are finite everywhere in (S). In particular, formula (70) implies an
assertion mentioned in Sec. XV.6 which states that if the condition
∂P/∂y = ∂Q/∂x holds in a simply-connected domain (G) in the plane the
expression P dx + Q dy is a total differential in the domain. Actually,
by formula (70), we have $\oint_{(L)} (P\,dx + Q\,dy) = 0$ for any closed
contour (L) lying in (G) and hence the assertion follows from
Sec. XIV.24. The condition that the domain in question should be
simply-connected implies that for any contour (L) belonging to (G)
the portion of the plane lying inside (L) also belongs to (G) (which
may not hold for a multiply-connected domain), the implication
being applied to the deduction of the above assertion.
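
Green’s formula (70) lends itself to a direct numerical check. The sketch below assumes the sample functions P = −x²y and Q = xy² and the unit disc as the domain (S); these choices and all helper names are illustrative assumptions. For these functions ∂Q/∂x − ∂P/∂y = x² + y², which was differentiated by hand.

```python
import math

def P(x, y):
    # Sample function P (an illustrative assumption).
    return -x * x * y

def Q(x, y):
    # Sample function Q (an illustrative assumption).
    return x * y * y

def green_integrand(x, y):
    # dQ/dx - dP/dy for the sample functions, differentiated by hand.
    return y * y + x * x

def circulation_over_unit_circle(n=4000):
    # Left-hand side of (70): the contour is x = cos t, y = sin t,
    # described in the positive (counterclockwise) direction.
    total = 0.0
    dt = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        total += (P(x, y) * (-math.sin(t)) + Q(x, y) * math.cos(t)) * dt
    return total

def double_integral_over_unit_disc(nr=400, nt=400):
    # Right-hand side of (70), computed in polar coordinates (dS = r dr dt).
    total = 0.0
    dr = 1.0 / nr
    dt = 2.0 * math.pi / nt
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            t = (j + 0.5) * dt
            total += green_integrand(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

if __name__ == "__main__":
    print(circulation_over_unit_circle(), double_integral_over_unit_disc())
```

Both numbers approximate π/2, the exact value of the integral of x² + y² over the unit disc.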

We now proceed to deduce an analogous formula (Stokes’ formula)
for a spatial field.* The formula discovered in 1854 by the English
physicist and mathematician G. G. Stokes (1819-1903) is widely
applied in the theory of vector fields. Let a finite oriented contour
(L) bounding a finite oriented surface (S) be given. Let the
orientations of (L) and (S) be coherent as it is shown in Fig. 329.

Divide (S) into small surfaces (ΔS₁), (ΔS₂), ..., (ΔS_m) bounded
by the contours (ΔL₁), (ΔL₂), ..., (ΔL_m). The contours (ΔL_i)
(i = 1, ..., m) are considered to be oriented according to the
orientations of (L) and (S). Then we readily conclude that

$$\oint_{(L)} \mathbf{A}\cdot d\mathbf{r} = \sum_{i=1}^{m} \oint_{(\Delta L_i)} \mathbf{A}\cdot d\mathbf{r} \tag{71}$$

because the integrals taken over the arcs entirely lying inside (L)
and which enter into the right-hand side of (71) mutually cancel
out (why?) and the sum of the remaining integrals just equals the
left-hand side of formula (71). Regarding (ΔS_i) (i = 1, ..., m)
as being infinitesimal we can apply formula (66) to each integral
$\oint_{(\Delta L_i)} \mathbf{A}\cdot d\mathbf{r}$ which results in

$$\oint_{(L)} \mathbf{A}\cdot d\mathbf{r} = \sum_{i=1}^{m} (\operatorname{rot}_n\mathbf{A})_i\,\Delta S_i + \dots$$

where the subscript i indicates that the corresponding value of
rot_n A is taken at a point belonging to the ith area. The sum on the
right-hand side is an integral sum (see Sec. 2), and therefore passing
to the limit in the process when the linear sizes of all the subdomains
are decreased unlimitedly we obtain

$$\oint_{(L)} \mathbf{A}\cdot d\mathbf{r} = \int_{(S)} \operatorname{rot}_n\mathbf{A}\,dS = \int_{(S)} \operatorname{rot}\mathbf{A}\cdot d\mathbf{S} \tag{72}$$

Thus, the circulation of a vector field over a closed contour is
equal to the flux of the rotation of the field through a surface
bounded by the contour. It is formula (72) that is called Stokes’ formula.

* Can be omitted for the first reading of the book. (Tr.)


Fig. 330    Fig. 331


The formula holds if the field A and its rotation are finite on the 
surface (S). It also holds if the rotation approaches infinity in such 
a way that the integral on the right-hand side of (72) converges. 
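
Stokes’ formula (72) can be illustrated by a numerical experiment. The sketch below assumes the sample field A = (z, x, y), for which formula (65) gives the constant vector rot A = (1, 1, 1) (computed by hand), and takes for (S) the upper unit hemisphere with outer normal, so that (L) is the unit circle in the x, y-plane described counterclockwise. The field, the surface and the function names are illustrative assumptions.

```python
import math

def field(x, y, z):
    # Sample field (an illustrative assumption): A = (z, x, y).
    return (z, x, y)

# rot A for the sample field, obtained by hand from formula (65).
ROT_A = (1.0, 1.0, 1.0)

def circulation(n=2000):
    # Left-hand side of (72): contour x = cos t, y = sin t, z = 0.
    total = 0.0
    dt = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        ax, ay, az = field(math.cos(t), math.sin(t), 0.0)
        # dr = (-sin t, cos t, 0) dt along the contour.
        total += (ax * (-math.sin(t)) + ay * math.cos(t)) * dt
    return total

def flux_of_rotation(n_theta=200, n_phi=400):
    # Right-hand side of (72) over the upper unit hemisphere;
    # the outer unit normal coincides with the radius-vector,
    # and dS = sin(theta) d(theta) d(phi).
    total = 0.0
    d_theta = (math.pi / 2.0) / n_theta
    d_phi = 2.0 * math.pi / n_phi
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            normal = (math.sin(theta) * math.cos(phi),
                      math.sin(theta) * math.sin(phi),
                      math.cos(theta))
            rot_n = sum(c * n for c, n in zip(ROT_A, normal))
            total += rot_n * math.sin(theta) * d_theta * d_phi
    return total

if __name__ == "__main__":
    print(circulation(), flux_of_rotation())
```

Any other surface spanned by the same circle, for instance the flat disc z = 0, must give the same flux, since the left-hand side of (72) depends only on the contour.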

We note that a contour (L) entering into Stokes’ formula can
consist of several portions (components). In this case the contours
must be oriented in a corresponding way (see Fig. 330). An analogous
remark relates to Ostrogradsky’s formula (Sec. 23).

In particular, Stokes’ formula implies the sufficiency of conditions
(XIV.99) (see Sec. XIV.24 where the question was discussed) for
integral (XIV.93) to be independent of the path of integration.
Actually, let us introduce the field A = Pi + Qj + Rk for which
we have rot A = 0 if conditions (XIV.99) hold. Now taking an
arbitrary closed contour (L) and an arbitrary surface (S) spanned by
the contour we deduce, on the basis of Stokes’ formula, equality
(XIV.95). The domain (G) of space in which the whole construction
is performed is supposed to be simply-connected which guarantees
the existence of a surface spanned by the contour because if we
contract the contour (L) within the above domain (G) in space to
a point it describes a surface (S) of the desired type (see Fig. 331).

28. Expressing Differential Operations on Vector Fields in a
Curvilinear Orthogonal Coordinate System. We now consider a
curvilinear orthogonal coordinate system λ, μ, ν in space. It is
natural to construct a system of unit vectors e_λ, e_μ, e_ν tangent to
the coordinate curves at each point of space and to resolve the vector
fields in question with respect to these vectors. Thus we obtain
a resolution of the form

$$\mathbf{A} = A_\lambda\,\mathbf{e}_\lambda + A_\mu\,\mathbf{e}_\mu + A_\nu\,\mathbf{e}_\nu$$

at each point.

To express the gradient of a scalar field u at an arbitrary point
M we should recall that when evaluating the gradient according
to formula (XII.2) we can place the Cartesian coordinate system
in any way (see Sec. XII.1). Thus, we can put i = e_λ, j = e_μ and
k = e_ν. This yields

$$\operatorname{grad} u = \frac{\partial u}{\partial s_\lambda}\,\mathbf{e}_\lambda + \frac{\partial u}{\partial s_\mu}\,\mathbf{e}_\mu + \frac{\partial u}{\partial s_\nu}\,\mathbf{e}_\nu = \frac{1}{l_\lambda}\frac{\partial u}{\partial\lambda}\,\mathbf{e}_\lambda + \frac{1}{l_\mu}\frac{\partial u}{\partial\mu}\,\mathbf{e}_\mu + \frac{1}{l_\nu}\frac{\partial u}{\partial\nu}\,\mathbf{e}_\nu$$

where l_λ, l_μ and l_ν are Lamé’s coefficients (see Sec. 15).

When calculating the divergence of a vector field we cannot directly
apply formula (59) to a curvilinear coordinate system. For instance,
if we put i = e_λ, etc. as above, the equality A_x = A_λ will
hold only at the point M (why?) and hence the derivative ∂A_x/∂x
cannot be found in such a simple way as above in the general case.
Here we can apply the method which was used at the beginning
of Sec. 24 in investigating the divergence. Let us consider the flux
of the field through the surface of an infinitesimal rectangular
parallelepiped bounded by the coordinate surfaces (see Fig. 332). Taking
the sum of fluxes across the two faces perpendicular to the coordinate
curve λ (on which λ varies whereas μ and ν are constant) we
obtain, to within infinitesimals of higher order, the expression

$$\partial_\lambda (A_\lambda\,ds_\mu\,ds_\nu) = \partial_\lambda (l_\mu l_\nu A_\lambda)\,d\mu\,d\nu = \frac{\partial (l_\mu l_\nu A_\lambda)}{\partial\lambda}\,d\lambda\,d\mu\,d\nu$$

Computing the fluxes through the other two pairs of faces, adding
together all the results and dividing by the element of volume
dΩ = l_λ l_μ l_ν dλ dμ dν we obtain

$$\operatorname{div}\mathbf{A} = \frac{1}{l_\lambda l_\mu l_\nu}\Big[\frac{\partial (l_\mu l_\nu A_\lambda)}{\partial\lambda} + \frac{\partial (l_\lambda l_\nu A_\mu)}{\partial\mu} + \frac{\partial (l_\lambda l_\mu A_\nu)}{\partial\nu}\Big]$$
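To derive the expression for the rotation we must take advantage of
formula (67). The circulation of the vector A over the contour of
an infinitesimal rectangle perpendicular to the vector e_λ (see
Fig. 333) is equal to

$$\oint \mathbf{A}\cdot d\mathbf{r} = \Big(\int_{NP} - \int_{MQ}\Big) - \Big(\int_{QP} - \int_{MN}\Big) = \partial_\mu (A_\nu\,ds_\nu) - \partial_\nu (A_\mu\,ds_\mu) =$$

$$= \partial_\mu (l_\nu A_\nu\,d\nu) - \partial_\nu (l_\mu A_\mu\,d\mu) = \Big[\frac{\partial (l_\nu A_\nu)}{\partial\mu} - \frac{\partial (l_\mu A_\mu)}{\partial\nu}\Big]\,d\mu\,d\nu$$

to within infinitesimals of higher order of smallness. Dividing by
the surface element dS = l_μ l_ν dμ dν and performing circular
permutation of the indices we obtain the formulas

$$(\operatorname{rot}\mathbf{A})_\lambda = \frac{1}{l_\mu l_\nu}\Big[\frac{\partial (l_\nu A_\nu)}{\partial\mu} - \frac{\partial (l_\mu A_\mu)}{\partial\nu}\Big],$$

$$(\operatorname{rot}\mathbf{A})_\mu = \frac{1}{l_\nu l_\lambda}\Big[\frac{\partial (l_\lambda A_\lambda)}{\partial\nu} - \frac{\partial (l_\nu A_\nu)}{\partial\lambda}\Big],$$

$$(\operatorname{rot}\mathbf{A})_\nu = \frac{1}{l_\lambda l_\mu}\Big[\frac{\partial (l_\mu A_\mu)}{\partial\lambda} - \frac{\partial (l_\lambda A_\lambda)}{\partial\mu}\Big]$$

Fig. 332    Fig. 333

All these formulas are naturally simplified in the case of a plane
field for which we must put A_ν = 0 and l_ν = 1 and regard all the
quantities involved as being independent of ν.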
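
As an illustration of the divergence formula in curvilinear coordinates, the sketch below applies it to spherical coordinates. It rests on assumptions not stated in this section: the coordinates are taken as (λ, μ, ν) = (r, θ, φ) with Lamé’s coefficients 1, r and r sin θ (a standard fact), the sample field is A = r e_r, i.e. the radius-vector field, and the helper names are hypothetical. Formula (59) applied to the same field A = xi + yj + zk gives div A = 3, so the curvilinear expression should return 3 at every point.

```python
import math

def lame(r, theta, phi):
    # Lamé's coefficients of spherical coordinates (assumed standard values).
    return (1.0, r, r * math.sin(theta))

def components(r, theta, phi):
    # Sample field (an illustrative assumption): A = r e_r.
    return (r, 0.0, 0.0)

def div_curvilinear(q, h=1e-5):
    # div A = [d(l2*l3*A1)/dq1 + d(l1*l3*A2)/dq2 + d(l1*l2*A3)/dq3] / (l1*l2*l3),
    # with the derivatives approximated by central differences.
    def weighted(point, i):
        l = lame(*point)
        a = components(*point)
        others = [l[j] for j in range(3) if j != i]
        return others[0] * others[1] * a[i]

    total = 0.0
    for i in range(3):
        plus = list(q)
        minus = list(q)
        plus[i] += h
        minus[i] -= h
        total += (weighted(plus, i) - weighted(minus, i)) / (2.0 * h)
    l = lame(*q)
    return total / (l[0] * l[1] * l[2])

if __name__ == "__main__":
    print(div_curvilinear((2.0, 0.9, 0.4)))
```

Replacing `lame` by the coefficients of another orthogonal system (for instance cylindrical coordinates) changes nothing else in the computation.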

29. General Formula for Transforming Integrals. It turns out
that the formulas of Stokes, Ostrogradsky and some formulas in
a multi-dimensional space analogous to them can be written
in the form of a single formula which generalizes them all. Let us
take the k-dimensional space E_k in which the Lebesgue measure
is introduced (see Sec. 20). Consider an oriented (p + 1)-dimensional
(p = 1, 2, ..., k − 1) manifold (Ω) with the p-dimensional
boundary (Ω′) lying in E_k. The orientation of (Ω) induces the
corresponding orientation of (Ω′) according to the following rule:
if a small (p + 1)-dimensional tetrahedron A₁A₂A₃ ... A_{p+1}A_{p+2}
which belongs to (Ω) and whose vertices are enumerated in accord
with the orientation of (Ω) is placed so that its face A₁A₂A₃ ...
... A_{p+1} belongs to (Ω′) this order of the vertices must correspond
to the orientation of (Ω′). [Let the reader check that if (Ω) is a surface
in the three-dimensional geometric space the above rule coincides
with the ordinary rule of the coherence between the orientation
of a surface and its contour.]

We now consider integrals of form (54) where (S) = (Ω′). The
element of integration

$$\omega = \sum_{m_1, m_2, \dots, m_p = 1}^{k} u_{m_1 m_2 \dots m_p}(t_1, t_2, \dots, t_k)\,dt_{m_1}\,dt_{m_2} \dots dt_{m_p} \tag{73}$$

is a homogeneous function of degree p with respect to the
differentials dt₁, ..., dt_k. It is called the differential form of degree p
(p-form).

There are some operations which can be performed on differential
forms. For instance, we can add together forms of the same degree.
By the way, expression (73) is a sum of the simplest forms which
are monomials in dt₁, .... Differential forms can be multiplied
by one another under the convention that, according to the
definition of an integral of form (53), a permutation of two differentials
entering into a monomial results in multiplying it by −1, and
that if there are two similar differentials the monomial is considered,
by definition, to be equal to zero. A differential form can also
be multiplied by a constant or by a function of t₁, t₂, ..., t_k. By
the way, the latter can be regarded as a form of degree zero. The
ordinary rules of addition and multiplication hold in this case
with the exception that the multiplication of forms is
non-commutative in the general case.

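A differential form is differentiated according to the following
rule:

$$d\omega = d\Big(\sum_{m_1, \dots, m_p} u_{m_1 m_2 \dots m_p}\,dt_{m_1} \dots dt_{m_p}\Big) = \sum_{m_1, \dots, m_p} du_{m_1 \dots m_p}\,dt_{m_1} \dots dt_{m_p} =$$

$$= \sum_{m_1, \dots, m_p} \Big(\sum_{m=1}^{k} \frac{\partial u_{m_1 \dots m_p}}{\partial t_m}\,dt_m\Big)\,dt_{m_1} \dots dt_{m_p}$$

where we must remove the brackets and combine the similar terms.
We see that the differentiation of a differential form increases its
degree by unity.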

It turns out that the above definitions imply the general formula
for transforming integrals (54):

$$\underbrace{\int \dots \int}_{p\ \text{times}}{}_{(\Omega')}\ \omega = \underbrace{\int \dots \int}_{(p+1)\ \text{times}}{}_{(\Omega)}\ d\omega \tag{74}$$

We shall not give the proof of the formula here.


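As an example, let us take the case k = 2, p = 1. If we write
x and y in place of t₁ and t₂ and introduce the notation

$$\omega = P(x, y)\,dx + Q(x, y)\,dy$$

we obtain

$$d\omega = dP\,dx + dQ\,dy = \Big(\frac{\partial P}{\partial x}\,dx + \frac{\partial P}{\partial y}\,dy\Big)\,dx + \Big(\frac{\partial Q}{\partial x}\,dx + \frac{\partial Q}{\partial y}\,dy\Big)\,dy = \Big(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\Big)\,dx\,dy$$

(here dx dx = dy dy = 0 and dy dx = −dx dy by the above convention).
It follows that formula (74) turns into a formula which differs from
Green’s formula (70) only in the notation of the domains of
integration. Let the reader consider the case k = 3, p = 1 (which leads
to Stokes’ formula) and the case k = 3, p = 2 (which yields
Ostrogradsky’s formula). In treating these cases the reader should take
into account the following expression of a flux in the form of a
double integral with respect to coordinates:

$$\int_{(\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma} = \int_{(\sigma)} \mathbf{A}\cdot\mathbf{n}\,d\sigma = \int_{(\sigma)} [A_x \cos(\mathbf{n}, x) + A_y \cos(\mathbf{n}, y) + A_z \cos(\mathbf{n}, z)]\,d\sigma = \iint_{(\sigma)} (A_x\,dy\,dz + A_y\,dz\,dx + A_z\,dx\,dy)$$

(compare this expression with formulas given in Sec. 22).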


CHAPTER XVII


Series


We have already dealt with series in our course. We suggest that
the reader should look through Sec. III.6, where the basic definitions
of the convergence and of the sum of an infinite series were
formulated, before proceeding to study the present chapter. Here
we shall give a systematic presentation of the theory of series.


§ 1. Number Series


1. Positive Series. We now consider a series of the form

$$a_1 + a_2 + \dots + a_n + \dots \tag{1}$$

in which a_n ≥ 0 for all n = 1, 2, 3, .... Such a series with
non-negative terms is referred to as a positive series. As in Sec. III.6, let
us denote the partial sums of the series as S₁, S₂, S₃, ....
In this case we have S₁ ≤ S₂ ≤ ... ≤ S_n ≤ ... (why?). Therefore,
recalling the two possible ways of variation of an increasing
quantity (see Sec. III.5) we conclude that series (1) is either
convergent or properly divergent (i.e. divergent to infinity), its sum
being +∞ in the latter case. This can be written as

$$\sum_{k=1}^{\infty} a_k < \infty \qquad\text{or}\qquad \sum_{k=1}^{\infty} a_k = \infty$$

respectively.
It should be noted that the first inequality is a symbolical expression
of the convergence of the series and it makes sense only for
positive series.
If besides series (1) we consider a series

$$b_1 + b_2 + \dots + b_k + \dots \tag{2}$$

such that

$$0 \le a_k \le b_k \qquad (k = 1, 2, 3, \dots) \tag{3}$$

then

$$\sum_{k=1}^{\infty} a_k \le \sum_{k=1}^{\infty} b_k$$

Indeed, this is directly implied by the analogous inequality for
the partial sums of the series. Thus, we arrive at the comparison
test for the convergence of a series similar to the one given in
Sec. XIV.15 for an improper integral: if condition (3) holds the
convergence of series (2) implies the convergence of series (1) and
the divergence of series (1) implies the divergence of series (2).
For example, the series

$$\frac{1}{1^1} + \frac{1}{2^2} + \frac{1}{3^3} + \dots$$

converges, which follows from the comparison of this series with
series (III.6):

$$\frac{1}{n^n} < \frac{1}{2^n} \qquad (n = 3, 4, \dots)$$

Although the first terms of the series do not satisfy the above
inequality this does not affect the convergence (see Sec. III.6).
The comparison test implies an analogous test: if
a_k > 0, b_k > 0 (k = 1, 2, ...) and

$$\lim_{k\to\infty} \frac{a_k}{b_k} = C \qquad\text{where } C = \text{const} \ne 0 \text{ and } C \ne \infty$$

series (1) and (2) converge or, respectively, diverge simultaneously.
Actually, the above condition implies that the ratio a_k/b_k lies between
some constant positive limits m and M for all k:

$$m < \frac{a_k}{b_k} < M, \qquad\text{i.e.}\qquad m b_k < a_k < M b_k$$

Now, summing with respect to k from 1 to n and then passing to
the limit for n → ∞, we obtain

$$m \sum_{k=1}^{\infty} b_k \le \sum_{k=1}^{\infty} a_k \le M \sum_{k=1}^{\infty} b_k$$

which implies our assertion (why?).
Now let us formulate D’Alembert’s test which is sufficient for the
convergence of series (1) and is widely applied: if the limit

$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = l$$

exists for series (1), the series converges in the case l < 1 and
diverges in the case l > 1. The latter assertion is obvious. Indeed,
if l > 1 the ratio a_{n+1}/a_n (which approaches l as n increases) becomes
greater than unity for sufficiently large n. Hence, the terms of the
series increase together with n when n becomes sufficiently large
and therefore the necessary condition for the convergence of a
series (see Sec. III.6) is violated. Now let us turn to the case l < 1.
Choose a constant number l′ between l and 1. The ratio approaching
l indefinitely, we have a_{n+1}/a_n < l′ for n ≥ N where N is a fixed
number. Thus, we obtain

$$\frac{a_{N+1}}{a_N} < l', \qquad \frac{a_{N+2}}{a_{N+1}} < l', \qquad \frac{a_{N+3}}{a_{N+2}} < l', \ \dots$$

which implies

$$a_{N+1} < a_N l', \qquad a_{N+2} < a_{N+1} l' < a_N l'^2, \qquad a_{N+3} < a_{N+2} l' < a_N l'^3 \quad\text{etc.}$$

Hence, after some number N, the terms of series (1) are smaller than
the corresponding terms of the series

$$a_N + a_N l' + a_N l'^2 + a_N l'^3 + \dots$$

The quantity l′ satisfying the inequality 0 < l′ < 1, the latter
series is a geometric series (progression) with common ratio l′ < 1
which converges [see series (III.7)]. Hence, by the comparison
test, series (1) also converges.
Let us consider an example. To apply D’Alembert’s test to the
series

$$\sum_{n=1}^{\infty} \frac{a^n}{n^p} \qquad (a > 0,\ p > 0 \text{ or } p < 0) \tag{4}$$

it is necessary to consider the limit

$$\lim_{n\to\infty} \Big(\frac{a^{n+1}}{(n+1)^p} : \frac{a^n}{n^p}\Big) = \lim_{n\to\infty} \frac{a\,n^p}{(n+1)^p} = \lim_{n\to\infty} \frac{a}{\big(1 + \frac{1}{n}\big)^p} = a$$

Hence, series (4) converges for a < 1 and diverges for a > 1.
In the case a = 1 D’Alembert’s test does not enable us to find out
whether the series converges or not.

When D’Alembert’s test fails to give the answer it is sometimes
possible to apply Cauchy’s integral test which is more general. It
gives the following condition sufficient for the convergence of a
positive series: if the expression a_n can be defined not only for the
integral values of n (n = 1, 2, 3, ...) but also for all real n ≥ 1
and if a_n decreases when n increases, series (1) and the integral
$\int_1^{\infty} a_n\,dn$ converge or diverge simultaneously. To prove the test let
us establish the inequalities

$$\int_1^{\infty} a_n\,dn \le \sum_{n=1}^{\infty} a_n \le \int_1^{\infty} a_n\,dn + a_1 \tag{5}$$

which directly imply the above assertion according to the comparison
test. To obtain (5) we write, on the basis of Fig. 334a, the inequality

$$\int_1^{N} a_n\,dn < a_1 + a_2 + \dots + a_{N-1} \tag{6}$$

Similarly, Fig. 334b shows that

$$\int_1^{N} a_n\,dn > a_2 + a_3 + \dots + a_N$$

and hence

$$a_1 + a_2 + a_3 + \dots + a_N < \int_1^{N} a_n\,dn + a_1 \tag{7}$$

Fig. 334

If we pass to the limit as N → ∞ in inequalities (6) and (7) we just
arrive at (5).
As an example, let us take the series

$$\sum_{n=1}^{\infty} \frac{1}{n^p} \tag{8}$$

which is a particular case of series (4) for a = 1. D’Alembert’s test
does not give the answer whether the series converges because in
this case we have $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = 1$. But the integral test is applicable
here. Indeed, if we consider a_n = n^{−p} as a continuous function of
n for all real values of n (not less than unity) we see that it satisfies
the conditions of Cauchy’s integral test for p > 0 and therefore
series (8) and the integral

$$\int_1^{\infty} \frac{dn}{n^p}$$

converge or diverge simultaneously.
In Sec. XIV.15 we showed that the last integral converges only
when p > 1 [see the computation of integral (XIV.51)]. Thus,
series (8) converges only in the case p > 1. In particular, for p = 1
we obtain the so-called harmonic series

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \dots = \infty$$


Formulas (6) and (7) make it possible to estimate a partial sum 
of a divergent series which enables us to derive an asymptotic for- 
mula for such a sum depending on its number. We can similarly 
estimate any sum of a large number of summands which monotoni- 
cally depend on the number. To specify the result we can separately 
sum up a number of the greatest summands in a direct manner and 
apply the above method of estimation only to the remaining sum- 
mands because this will decrease the difference between the upper 
and lower bounds entering into a formula of type (5). 
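
The estimates (6) and (7) can be tried out on the harmonic series, for which a_n = 1/n and the integral from 1 to N equals ln N. Inequality (6), written for N + 1 in place of N, gives ln (N + 1) < S_N, and inequality (7) gives S_N < 1 + ln N. The sketch below simply tabulates these bounds; the function names are illustrative choices.

```python
import math

def harmonic_partial_sum(N):
    # S_N = 1 + 1/2 + ... + 1/N, summed directly.
    return sum(1.0 / n for n in range(1, N + 1))

def integral_bounds(N):
    # Lower bound from (6) with N + 1 in place of N, upper bound from (7),
    # both for a_n = 1/n, where the integral from 1 to N equals ln N.
    lower = math.log(N + 1)
    upper = 1.0 + math.log(N)
    return lower, upper

if __name__ == "__main__":
    for N in (10, 1000, 100000):
        lower, upper = integral_bounds(N)
        print(N, lower, harmonic_partial_sum(N), upper)
```

The gap between the bounds stays below unity for every N, which is the sense in which the partial sum of this divergent series behaves asymptotically like ln N.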

More accurate approximations can be obtained by applying formulas
of numerical integration (see Sec. XIV.13). For instance,
let us illustrate the application of Simpson's formula to an approximate
computation of the sum

$$S_{m,N} = \frac1m + \frac1{m+1} + \dots + \frac1N \qquad (m = 1, 2, \dots;\ N > m + 2)$$

To do this we write, on the basis of formula (XIV.39) taken for
$h = 1$, the relation

$$\int_k^{k+2} \frac{dx}{x} \approx \frac13\left(\frac1k + \frac4{k+1} + \frac1{k+2}\right)$$

Adding together these results for $k = m, m+1, \dots, N-1$
and performing some simple transformations and the integration
we obtain the approximate equality

$$\ln N - \ln m + \ln(N+1) - \ln(m+1) \approx \frac13\left(6S_{m,N} - \frac5m - \frac1{m+1} - \frac1N + \frac1{N+1}\right)$$

This implies

$$S_{m,N} \approx \ln N + \frac{6m+5}{6m(m+1)} - \frac12\ln(m^2+m) + \frac12\ln\left(1 + \frac1N\right) + \frac1{6N} - \frac1{6(N+1)} \qquad (9)$$
Fig. 334a obviously indicates that for the function $a_n = \frac1n$ there
exists a finite positive limit of the form

$$C = \lim_{N\to\infty}\left[\sum_{n=1}^{N}\frac1n - \int_1^N \frac{dn}{n}\right] = \lim_{N\to\infty}\left(S_{1,N} - \ln N\right)$$

which is called Euler's constant. Equality (9) and the formula
$S_{m,N} = S_{1,N} - \frac11 - \frac12 - \dots - \frac1{m-1}$ imply an approximate
expression for Euler's constant of the form

$$C \approx 1 + \frac12 + \dots + \frac1{m-1} + \frac{6m+5}{6m(m+1)} - \frac12\ln(m^2+m)$$

whose accuracy increases with the growth of $m$. For $m = 1$ and
$m = 2$ we get the approximate values 0.570 and 0.576, respectively,
of Euler's constant. The calculations show that $C = 0.5772$ to
within $10^{-4}$, and thus the above values are accurate to two decimal
places.
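The approximate expression for Euler's constant can be checked directly. The sketch below (the function name is an illustrative choice) evaluates $1 + \frac12 + \dots + \frac1{m-1} + \frac{6m+5}{6m(m+1)} - \frac12\ln(m^2+m)$ for several values of $m$.

```python
import math


def euler_constant_estimate(m):
    """Approximation of Euler's constant obtained from Simpson's formula."""
    head = sum(1.0 / k for k in range(1, m))  # empty sum when m = 1
    return head + (6 * m + 5) / (6 * m * (m + 1)) - 0.5 * math.log(m * m + m)


for m in (1, 2, 5, 10):
    print(m, euler_constant_estimate(m))
```

For $m = 1$ and $m = 2$ this reproduces the values 0.570 and 0.576 quoted above, and larger $m$ brings the estimate closer to 0.5772.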

For greater detail on the problem of computing sums by means
of integrals the reader is referred to [51] (see §§ I.2 and III.4).

2. Series with Terms of Arbitrary Signs. We now turn to series
of the form

$$a_1 + a_2 + \dots + a_n + \dots \qquad (10)$$

whose terms can be of any sign. Let us form the positive series

$$|a_1| + |a_2| + \dots + |a_n| + \dots$$

with terms equal to the absolute values of the terms of series (10).
We assert that if

$$\sum_{k=1}^\infty |a_k| < \infty \qquad (11)$$

series (10) is convergent. In this case series (10) is said to be
absolutely convergent. The proof is quite similar to that of an analogous
property discussed in Sec. XIV.15 and we leave it to the reader. If
the series $|a_1| + |a_2| + \dots + |a_n| + \dots$ diverges, series (10) may
nevertheless converge. In such a case we say that series (10) is
conditionally convergent.

When we are given a general series of form (10) we can apply the
tests given in Sec. 1 to the series of the moduli of its terms and thus
investigate whether it converges absolutely or not. For instance, if

$$\lim_{n\to\infty}\frac{|a_{n+1}|}{|a_n|} < 1$$

the series $|a_1| + |a_2| + \dots + |a_n| + \dots$ converges according
to D'Alembert's test and hence series (10) converges absolutely.
If the limit exceeds unity the necessary condition for convergence of a
series is violated and thus series (10) diverges.


The following test is referred to as the Leibniz test. It is applied
to the so-called alternating series, i.e. series of the form

$$a_1 - a_2 + a_3 - a_4 + \dots \qquad (12)$$

where $a_k > 0$ $(k = 1, 2, \dots)$. The Leibniz test asserts that if
$a_1 > a_2 > a_3 > \dots$ and $\lim_{k\to\infty} a_k = 0$ series (12) converges. To
prove the test let us mark the points corresponding to the numerical
values of the partial sums of series (12) on the $S$-axis (see
Fig. 335). Then, considering the transitions from 0 to $S_1$, from $S_1$ to $S_2$,
from $S_2$ to $S_3$ etc. we see that every subsequent transition is performed
in the direction opposite to that of the preceding transition and at
the same time the corresponding distances between the points $S_k$
and $S_{k+1}$ $(k = 1, 2, \dots)$ decrease. Thus, we have $0 < S_2 < S_1$,
$S_2 < S_3 < S_1$, $S_2 < S_4 < S_3$, ... (see Fig. 335). Consequently,
the partial sums with even numbers form an increasing bounded
sequence and therefore they have a finite limit $S'$ (see property 10
in Sec. III.5). Similarly, the partial sums with odd numbers form
a decreasing bounded sequence and have a limit $S''$. Taking the
equality $S_{2n+1} = S_{2n} + a_{2n+1}$ and passing to the limit as $n \to \infty$
we obtain $S'' = S'$. Hence, all the partial sums have the same limit
and thus series (12) converges. Incidentally, we conclude that the
sum of series (12) lies between any sum with an even number and any
sum with an odd number, which enables us to estimate the sum of the
series.

Fig. 335
For example, the series

$$\sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^p}$$

satisfies the conditions of the Leibniz test for $p > 0$ and therefore
it converges for such $p$. For $p > 1$ the convergence will be absolute
because, as we know, the series $\sum_{n=1}^\infty \frac{1}{n^p}$ converges for $p > 1$ (see
Sec. 1). But in the case $0 < p \le 1$ the series $\sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^p}$ is
conditionally convergent, i.e. it is convergent but not absolutely
convergent.
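The bracketing of the sum by even and odd partial sums is easy to observe numerically. The following sketch (function name chosen here for illustration) lists the partial sums of $\sum (-1)^{n-1}/n^p$; for $p = 1$ the sum is known to be $\ln 2$.

```python
import math


def alternating_partial_sums(p, count):
    """Partial sums S_1, ..., S_count of the series sum (-1)**(n-1) / n**p."""
    sums, total = [], 0.0
    for n in range(1, count + 1):
        total += (-1) ** (n - 1) / n ** p
        sums.append(total)
    return sums


sums = alternating_partial_sums(1.0, 20)
even = sums[1::2]  # S_2, S_4, ... (increasing)
odd = sums[0::2]   # S_1, S_3, ... (decreasing)
print(max(even), math.log(2), min(odd))
```

The printed values show the largest even-numbered sum below and the smallest odd-numbered sum above $\ln 2$, so the pair gives a two-sided estimate of the sum.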
In conclusion, after passing to the limit in the inequality

$$\left|\sum_{k=1}^n a_k\right| \le \sum_{k=1}^n |a_k|$$

we see that if series (10) converges it satisfies the inequality

$$\left|\sum_{k=1}^\infty a_k\right| \le \sum_{k=1}^\infty |a_k|$$


3. Operations on Series.

1. Convergent series can be termwise added together (or subtracted),
that is if

$$a_1 + a_2 + \dots + a_n + \dots = S \quad\text{and}\quad b_1 + b_2 + \dots + b_n + \dots = T$$

we have

$$(a_1 + b_1) + (a_2 + b_2) + \dots + (a_n + b_n) + \dots = S + T$$

To prove this we must take the obvious expression $P_n = S_n + T_n$
for the partial sum of the latter series and then pass to the limit as $n \to \infty$.

This property enables us to perform the following transformation
of a series. Suppose we are given an absolutely convergent series
with terms of arbitrary signs. Let us put it down in the form

$$a - b - c + d + e + f - g + \dots = S \qquad (13)$$

where all $a, b, c, \dots$ are positive. Let us form the positive series

$$\left.\begin{aligned} a + 0 + 0 + d + e + f + 0 + \dots &= S_1 \\ 0 + b + c + 0 + 0 + 0 + g + \dots &= S_2 \end{aligned}\right\} \qquad (14)$$

Here we have separately added together all the positive terms and
all the absolute values of the negative terms of the original series.
Then $S$ is equal to the difference $S_1 - S_2$ because we can termwise
subtract the second series from the first one. This operation
can be performed only on absolutely convergent series because
both series (14) diverge for a conditionally convergent series of
form (13) (why?). In the case of a conditionally convergent series
the partial sums of both series (14) tend to infinity and the conditional
convergence of series (13) is due to the “balance” between these
infinities, i.e. the difference between the partial sums tends to a finite
limit although the sums themselves are infinitely large.
The next property can be verified in a similar way.

2. A convergent series can be multiplied termwise by a constant
factor, i.e. if

$$a_1 + a_2 + \dots + a_n + \dots = S$$

we have

$$ka_1 + ka_2 + \dots + ka_n + \dots = kS$$


3. We can arbitrarily group the terms when summing a convergent
series; for instance, if

$$a_1 + a_2 + a_3 + a_4 + a_5 + a_6 + a_7 + a_8 + \dots = S \qquad (15)$$

we have

$$(a_1 + a_2) + a_3 + (a_4 + a_5 + a_6) + (a_7 + a_8) + \dots = S \qquad (16)$$

Indeed, if the partial sums $S_1, S_2, S_3, \dots$ of the former series tend
to $S$, the partial sums of the latter series, which respectively equal
$S_2, S_3, S_6, S_8, \dots$, also tend to $S$.

If series (15) properly diverges and we have $S = \infty$ the same is
true for series (16). But if (15) is an oscillating divergent series (see
Sec. III.6) series (16) can diverge or converge and its sum will depend
on the way of grouping the terms, i.e. on the way of bracketing.
For instance, for series (III.9) we have

$$(1 - 1) + (1 - 1) + (1 - 1) + \dots = 0 + 0 + 0 + \dots = 0$$

and

$$1 + (-1 + 1) + (-1 + 1) + \dots = 1 + 0 + 0 + \dots = 1$$

Before the difference between convergent and divergent series was
understood the above fact had been thought of as an inexplicable
paradox. The modern definition of the sum of a convergent series
was formulated by Cauchy in 1821, after the theory of limits was
created, although series were widely used as early as the 17th and
18th centuries.

4. We can arbitrarily rearrange the terms in a positive series 
without affecting its sum. 

Indeed, if we arbitrarily change the order of terms in a positive 
series (without omitting any of them) and take successive partial 
sums of the new series, any term of the original series will enter into 
all the sums with sufficiently large numbers. Consequently, any 
partial sum of the original series will be a part of a partial sum of 
the new series having a sufficiently large number. This implies that 
the limit of the partial sums of the original series, i.e. its sum, 
does not exceed the sum of the new series. The original series can 
be obtained from the new series by rearranging the terms of the latter 
and therefore the same argument shows that the new sum cannot 
exceed the old one. Hence, these sums are equal.

An absolutely convergent series with terms of arbitrary signs can 
also be rearranged in an arbitrary way without affecting the sum. 

Actually, as was shown in property 1, an absolutely convergent
series can be represented as a difference of two positive convergent
series. Hence, any rearrangement of the terms of the original series
reduces to a rearrangement of the terms of these two series which,
as has been shown, does not affect their sums.

Rearranging the terms of a conditionally convergent series we
can make it converge to any sum and even make it diverge. The
reason is that the partial sums of the positive and negative terms
[see (14)] of a conditionally convergent series are in balance in
the sense that their rates of growth are of the same order. When
rearranging the terms of such a series we can change the relation
between these rates, which can lead to the above result. At first
glance this fact looks like a paradox. We shall illustrate what has
been said by giving a simple example.

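Take the series

$$1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \dots = S \qquad (17)$$

According to Sec. 2, its sum lies between the limits $S_2 = 0.5$ and
$S_1 = 1$. By property 2, this implies that

$$\frac12 - \frac14 + \frac16 - \frac18 + \frac1{10} - \dots = \frac{S}{2}$$

and therefore we also have

$$0 + \frac12 + 0 - \frac14 + 0 + \frac16 + 0 - \frac18 + \dots = \frac{S}{2}$$

Adding termwise the last series and series (17) we obtain

$$1 + 0 + \frac13 - \frac12 + \frac15 + 0 + \frac17 - \frac14 + \frac19 + 0 + \dots = \frac32 S$$

that is

$$1 + \frac13 - \frac12 + \frac15 + \frac17 - \frac14 + \frac19 + \frac1{11} - \frac16 + \dots = \frac32 S$$

But the above series can be obtained from series (17) by rearranging
the terms of the latter (check it up!) and hence we see that the sum
has been changed.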
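The change of the sum under rearrangement can be observed numerically. In the sketch below (function names are illustrative) the rearranged series is summed in blocks of two positive terms followed by one negative term, $\frac1{4k-3} + \frac1{4k-1} - \frac1{2k}$, and compared with the series taken in its original order, whose sum is $\ln 2$.

```python
import math


def original_order_sum(terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... taken in the usual order."""
    return sum((-1) ** (n - 1) / n for n in range(1, terms + 1))


def rearranged_sum(blocks):
    """Sum of `blocks` groups 1/(4k-3) + 1/(4k-1) - 1/(2k), k = 1, 2, ..."""
    total = 0.0
    for k in range(1, blocks + 1):
        total += 1.0 / (4 * k - 3) + 1.0 / (4 * k - 1) - 1.0 / (2 * k)
    return total


print(original_order_sum(100000), math.log(2))
print(rearranged_sum(100000), 1.5 * math.log(2))
```

Both sums use exactly the same terms; only the order differs, and the second one approaches $\frac32 \ln 2$ rather than $\ln 2$.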

Thus, when dealing with processes connected with a rearrangement
of the terms of a series we can treat absolutely convergent
series as if they were finite sums. At the same time such operations
must be performed with caution on conditionally convergent series.

4. Speed of Convergence of a Series. In practical computations
of the sum of a series we usually compute the sum of several terms
and neglect the others when there is no reason to think that these
terms can essentially affect the sum (compare with the computation
of the number $e$ in Sec. IV.16). For this method to yield a good result
it is necessary that the series in question not just converge but
converge fast, so that we exhaust almost the whole sum when
taking a small number of terms, that is obtain a result approximating
the sum with a sufficient accuracy. If a series converges
slowly it is usually inapplicable to practical computations. However,
it is sometimes possible to obtain a new series by performing
some operation on the original series or to derive an approximate
expression for its remainder by applying integrals as it was
done in Sec. 1. Conditionally convergent series usually converge
very slowly (see Sec. 2). But there are also absolutely convergent
series which converge slowly.

The speed of convergence of a series is essentially dependent on
the rate at which its general term tends to zero as the number $n$
increases. Series whose general term $a_n$ is of the order
of $n^{-p}$ [i.e. $a_n = O(n^{-p})$; see Sec. III.11] for $p > 1$ usually
converge slowly. The greater $p$, the better the convergence of such
series. Series with terms $a_n$ of the order of $q^n$, $0 < q < 1$, converge
faster. Their convergence can be compared with that of a geometric
series of form (III.7), and the smaller $q$, the faster the convergence.


The convergence of series whose terms $a_n$ are of the order of $\frac{K^n}{n!}$ is
still better etc.

Of course, what has been said represents only general considera- 
tions concerning the speed of convergence, and in a concrete case 
not only the behaviour of the general term for n — oo can be essen- 
tial but also the character of the first terms of the series. 
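To make these general considerations concrete, the following sketch counts how many terms are needed before the partial sum is within a given tolerance of the known sum, for one series of each type. The three sample series and the helper name `terms_needed` are illustrative choices, not taken from the text.

```python
import math


def terms_needed(term, exact_sum, tol, max_terms=10**6):
    """First n for which |exact_sum - S_n| < tol, found by direct summation."""
    partial = 0.0
    for n in range(1, max_terms + 1):
        partial += term(n)
        if abs(exact_sum - partial) < tol:
            return n
    raise ValueError("tolerance not reached within max_terms")


# a_n = n**-2 (sum pi**2/6), a_n = (1/2)**n (sum 1), a_n = 1/n! (sum e - 1)
power_law = terms_needed(lambda n: 1.0 / n**2, math.pi**2 / 6, 1e-3)
geometric = terms_needed(lambda n: 0.5**n, 1.0, 1e-3)
factorial = terms_needed(lambda n: 1.0 / math.factorial(n), math.e - 1.0, 1e-3)
print(power_law, geometric, factorial)
```

The power-law series needs on the order of a thousand terms for three decimal places, while the geometric and factorial-type series need only a handful.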

When we are given a slowly convergent series

$$a_1 + a_2 + \dots + a_n + \dots \qquad (18)$$

we can try to pass from it to a series which converges faster. One
of these methods is as follows. We choose a series

$$b_1 + b_2 + \dots + b_n + \dots = \sigma$$

whose sum $\sigma$ is known so that $a_n \sim b_n$ for $n \to \infty$ (see Secs. II.7, 8).
Then we have $a_n = b_n + \gamma_n$ where $|\gamma_n| \ll |a_n|$, and therefore
series (18) can be represented in the form

$$(b_1 + \gamma_1) + (b_2 + \gamma_2) + \dots = (b_1 + b_2 + \dots) + (\gamma_1 + \gamma_2 + \dots) = \sigma + (\gamma_1 + \gamma_2 + \dots)$$

and the general term of the latter series tends to zero faster than
that of the original series.

To apply the above method we must have a set of series with
known sums at our disposal. Geometric series (III.7), the series
indicated in Sec. IV.16, some combinations of these series and the series

$$\sum_{n=1}^\infty \frac{1}{n^p} = \zeta(p) \qquad (p > 1) \qquad (19)$$

are most frequently used for this purpose. The latter sum (dependent
on $p$) is called the Riemann zeta function after the prominent
German mathematician G. F. B. Riemann (1826-1866) although
it was Euler who was the first to introduce the function in 1737. The
tables of the values of the function can be found in [23].

For instance, let us take the series

$$\sum_{n=1}^\infty \frac{1}{\sqrt{n^3+1}} \qquad (20)$$

Its terms are equivalent (as infinitesimals) to the terms of series (19)
for $p = \frac32$ as $n \to \infty$. Hence, series (20) converges but very slowly.
Taking advantage of inequalities (5) we can readily verify that the
remainder after $n$ terms of series (19) is equivalent to $\frac{1}{(p-1)n^{p-1}}$,
i.e. the remainder of series (20) is of the order of $2n^{-\frac12}$. Thus to
obtain its sum $S$ with an accuracy of 0.01 we must take about
40,000 terms! But here we can apply the above method and write

$$\frac{1}{\sqrt{n^3+1}} = \frac{1}{\sqrt{n^3}} + \gamma_n$$

where

$$\gamma_n = \frac{1}{\sqrt{n^3+1}} - \frac{1}{\sqrt{n^3}} = \frac{\sqrt{n^3} - \sqrt{n^3+1}}{\sqrt{n^3}\,\sqrt{n^3+1}} = -\frac{1}{\sqrt{n^3}\,\sqrt{n^3+1}\left(\sqrt{n^3} + \sqrt{n^3+1}\right)}$$

Consequently, series (20) can be represented in the form

$$S = \zeta\left(\tfrac32\right) - \sum_{n=1}^\infty \frac{1}{\sqrt{n^3}\,\sqrt{n^3+1}\left(\sqrt{n^3} + \sqrt{n^3+1}\right)} \qquad (21)$$

The tabular value of the first summand is equal to 2.612, and the
general term of the latter series is equivalent to $(2n^{\frac92})^{-1}$. Hence,
the remainder of the last series is of the order of $(7n^{\frac72})^{-1}$ which means
that to obtain its sum to within 0.01 it is sufficient to take only
three terms! If a greater accuracy is needed we can repeatedly apply
the method which yields

$$S = \zeta\left(\tfrac32\right) - \tfrac12\,\zeta\left(\tfrac92\right) + \sum_{n=1}^\infty \frac{3n^3 - 1}{2\sqrt{n^9}\,A\left(\sqrt{n^3} + A\right)\left(\sqrt{n^3}\,A + n^3 - 1\right)}$$

(check up the result!) where we have put, for brevity, $\sqrt{n^3+1} = A$.
The sum of the tabular values of the first two terms is equal to
2.085, and the remainder of the last series is equivalent to $\frac{3}{52}n^{-\frac{13}{2}}$.
Hence, to obtain $S$ with an accuracy of 0.001 it is sufficient to take
only two terms.
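The gain from representation (21) can be seen in a short computation. The sketch below is only an illustration: the function names are invented here, and the constant `ZETA_3_2` is an assumed, more precise value of $\zeta(\frac32)$ than the tabular 2.612 quoted in the text.

```python
import math

ZETA_3_2 = 2.6123753486854883  # assumed value of zeta(3/2); the text quotes 2.612


def direct_sum(terms):
    """Partial sum of series (20) taken term by term."""
    return sum(1.0 / math.sqrt(n**3 + 1) for n in range(1, terms + 1))


def accelerated_sum(terms):
    """zeta(3/2) minus the first `terms` terms of the series in (21)."""
    correction = 0.0
    for n in range(1, terms + 1):
        s = math.sqrt(n**3)
        A = math.sqrt(n**3 + 1)
        correction += 1.0 / (s * A * (s + A))
    return ZETA_3_2 - correction


reference = accelerated_sum(5000)  # remainder of (21) is negligible here
print(direct_sum(3), accelerated_sum(3), reference)
```

With three terms the transformed series is already within 0.01 of the reference value, whereas three terms of the original series are still far from it.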

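The above successive application of the method can be perfected
if we apply Taylor's series (IV.60) (the binomial series) to the
expression

$$\frac{1}{\sqrt{n^3+1}} = n^{-\frac32}\left(1 + n^{-3}\right)^{-\frac12} = n^{-\frac32}\left(1 - \frac{1}{2n^3} + \frac{3}{8n^6} - \frac{5}{16n^9} + \dots\right) \qquad (22)$$

Suppressing the terms of series (22) following after any term we
obtain the corresponding approximation. For instance, dropping
the terms after the third, we obtain

$$\frac{1}{\sqrt{n^3+1}} = n^{-\frac32}\left(1 - \frac{1}{2n^3} + \frac{3}{8n^6}\right) - \gamma_n \qquad (23)$$

which yields

$$S = \zeta\left(\tfrac32\right) - \tfrac12\,\zeta\left(\tfrac92\right) + \tfrac38\,\zeta\left(\tfrac{15}{2}\right) - \sum_{n=1}^\infty \gamma_n = 2.462 - \sum_{n=1}^\infty \gamma_n$$

The terms $\gamma_n$ can be found from (23). On the basis of (22), we
conclude that they are equivalent to $\frac{5}{16}n^{-\frac{21}{2}}$.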


Many other series can be transformed in a similar way. Besides
(19) we also use the series

$$1 - \frac{1}{2^p} + \frac{1}{3^p} - \frac{1}{4^p} + \dots = \zeta(p)\left(1 - \frac{1}{2^{p-1}}\right), \qquad (24)$$
pa a ~ 4 (+ H ) = 
a eee 


a 1 
x natat 2 
n=1 a 


4 
n(n +1) n+) (n-2) ]= 


Me 
an, | ee ) 
a. 


Formula (24) also holds for $0 < p < 1$ but the function $\zeta(p)$
cannot be defined by formula (19) in this case since the series
diverges, and it is therefore defined by means of another method which
we shall not discuss here. Formula (IV.61) implies that, for $p = 1$,
the left-hand side of formula (24) is a convergent series whose sum
is equal to $\ln 2$.

When we deal with alternating series it is usually very difficult
to guarantee the desired accuracy in practical computations. For
example, let us take series (IV.56) for the cosine putting $x = 100$:

$$\cos 100 = 1 - \frac{100^2}{2!} + \frac{100^4}{4!} - \frac{100^6}{6!} + \frac{100^8}{8!} - \dots \qquad (25)$$

The series on the right-hand side of (25) is convergent and even
absolutely convergent (why?). But we cannot use it for practical
calculations. The reason is that although its terms beginning with
the 51st decrease sufficiently fast so that the theoretical convergence
is guaranteed they become enormously large before that. The whole
sum does not exceed unity in its absolute value and hence all these
large terms must almost completely cancel one another. As we know
from Sec. I.9, in such circumstances, in order to achieve the desired
accuracy, we must carry out all the calculations with a great number
of significant digits and hence perform much unnecessary work.
Therefore we should avoid using series of type (25). If such series
occur we must transform them to other series convenient for practical
calculations. For instance, in the above example we can take
advantage of the periodicity of the cosine and pass to a considerably
smaller value of the argument.
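The loss of accuracy in (25) is easy to reproduce in ordinary double-precision arithmetic. The sketch below (function name chosen for illustration) sums the cosine series directly for $x = 100$ and again after reducing the argument by a multiple of $2\pi$ with the standard library function `math.remainder` (available in Python 3.7 and later).

```python
import math


def cos_series(x, terms):
    """Partial sum of 1 - x**2/2! + x**4/4! - ... with `terms` terms."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # next term: multiply by -x**2 / ((2k + 1)(2k + 2))
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total


naive = cos_series(100.0, 200)
reduced = cos_series(math.remainder(100.0, 2.0 * math.pi), 20)
print(naive, reduced, math.cos(100.0))
```

The direct sum is dominated by rounding errors of the huge intermediate terms and bears no relation to $\cos 100$, while the reduced-argument sum agrees with the library value.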

5. Series with Complex, Vector and Matrix Terms. The definition
of the convergence and of the sum of a series with complex terms

$$z_1 + z_2 + \dots + z_n + \dots \qquad (z_n = x_n + iy_n;\ n = 1, 2, 3, \dots) \qquad (26)$$

is completely the same as that of real series (see Sec. III.6). Series
(26) is usually reduced to two series of the form

$$x_1 + x_2 + \dots + x_n + \dots \quad\text{and}\quad y_1 + y_2 + \dots + y_n + \dots \qquad (27)$$

If both series (27) converge and have the sums $x$ and $y$, respectively,
series (26) also converges and has the sum $z = x + iy$. If at least
one of the series (27) is divergent series (26) is divergent as well.
Series (27) having real terms, we can apply the methods of Sec. 2
to them.

The following test is also of use: if

$$\sum_{n=1}^\infty |z_n| < \infty \qquad (28)$$

both series (27) are absolutely convergent and therefore series (26)
is also convergent. If (28) holds series (26) is said to be absolutely
convergent. The methods of Sec. 1 can be applied to a series
satisfying condition (28).

A series of the form

$$\mathbf{u}_1 + \mathbf{u}_2 + \dots + \mathbf{u}_n + \dots \qquad (29)$$

whose terms are vectors is treated in like manner. If all the vectors
$\mathbf{u}_n$ $(n = 1, 2, \dots)$ belong to a three-dimensional space we can pass
to the corresponding series with scalar terms by projecting series
(29) on the $x$, $y$ and $z$-axes.

We can also consider a series of the form

$$A_1 + A_2 + \dots + A_n + \dots \qquad (30)$$

where the terms $A_n$ $(n = 1, 2, \dots)$ are matrices. For series (30)
to converge it is necessary and sufficient that each of the series
formed of the corresponding elements of the matrices (Sec. XI.2)
(that is of the elements standing at the intersections of the same
rows and columns of the matrices) should converge.

The properties of series (26), (29) and (30) are the same as those
of real series (see Sec. 3).

6. Multiple Series. Finite sums can have more than one index
of summation.

For instance,

$$\sum_{i=1}^{2}\sum_{j=1}^{3} a_{ij} = a_{11} + a_{12} + a_{13} + a_{21} + a_{22} + a_{23},$$
4 4 { 
a 4 4 1 4 1 1 al 1 1 
Lae etatetatatetatatete 


and the like (compare with Sec. XVI.8). 

Infinite series can also have more than one summation index;
these series are called double, triple etc. and, generally, multiple.
We shall restrict ourselves to investigating a double series of the
simplest form

$$\sum_{i=1}^\infty\sum_{j=1}^\infty a_{ij} \qquad (31)$$

The series of higher multiplicity and also the series having variable
summation indices in the inner sum (the latter finite sum belongs
to this type) are treated similarly.

Let us first suppose that all $a_{ij} > 0$. We arrange all the terms
of series (31) in an arbitrary order and form an ordinary series.
For instance, we can arrange them as follows:

$$a_{11} + a_{12} + a_{21} + a_{13} + a_{22} + a_{31} + a_{14} + a_{23} + a_{32} + a_{41} + \dots \qquad (32)$$
It is the sum of this series (which does not depend on the order of
the summands; see property 4 in Sec. 3) that is called the sum of
series (31). There can be two cases here, the convergence and the
divergence, i.e.

$$\text{either}\quad \sum_{i=1}^\infty\sum_{j=1}^\infty a_{ij} < \infty \quad\text{or}\quad \sum_{i=1}^\infty\sum_{j=1}^\infty a_{ij} = \infty$$

Hence, for $a_{ij} > 0$, the sum of series (31) is independent of the
order of summation but of course we must not omit a single term
when forming a series of type (32). In particular, we can perform
the summation in the following ways:

$$\sum_{i=1}^\infty\sum_{j=1}^\infty a_{ij} = \sum_{i=1}^\infty\left(\sum_{j=1}^\infty a_{ij}\right) = \sum_{j=1}^\infty\left(\sum_{i=1}^\infty a_{ij}\right) \qquad (33)$$

If the terms $a_{ij}$ are of any sign or are complex numbers, the series
satisfying the condition

$$\sum_{i=1}^\infty\sum_{j=1}^\infty |a_{ij}| < \infty \qquad (34)$$

represent the simplest case, in which we call the series absolutely
convergent. If condition (34) holds the series (31) also converges
and its sum can be computed by means of any formula of form
(32) or (33) or with the help of the formula

$$\sum_{i=1}^\infty\sum_{j=1}^\infty a_{ij} = \lim_{M,N\to\infty}\sum_{i=1}^{M}\sum_{j=1}^{N} a_{ij}$$

and the like. If condition (34) is violated the result of a summation
of series (31) may depend on the order of the terms (see Sec. 3),
and then we have a more complicated case.

In particular, a double series is obtained when we multiply two
absolutely convergent series

$$S_1 = \sum_{i=1}^\infty a_i \quad\text{and}\quad S_2 = \sum_{i=1}^\infty b_i = \sum_{j=1}^\infty b_j$$

Before performing the multiplication we have changed the notation
of the summation index in one of the series. The multiplication
can be performed as follows:

$$S_1S_2 = \sum_{i=1}^\infty a_i \sum_{j=1}^\infty b_j = \sum_{i=1}^\infty\left(a_i\sum_{j=1}^\infty b_j\right) = \sum_{i=1}^\infty\left(\sum_{j=1}^\infty a_ib_j\right) = \sum_{i=1}^\infty\sum_{j=1}^\infty a_ib_j$$

When writing the last equality we have applied formula (33) because
the series are absolutely convergent. Thus, series of this type are
multiplied according to the rule of multiplication of finite sums
(i.e. each term of the first series is multiplied by each term of the
second series and so on) which results in an absolutely convergent
double series.

An analogous result is obtained when we multiply a greater number
of absolutely convergent series.
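As an illustration of this rule, the sketch below multiplies truncated series for $e^x$ and $e^y$ term by term and sums the resulting double array in two different orders, by rows and along the diagonals as in (32). The choice of the exponential series, the truncation length and the function names are assumptions made for the example; the indices start from 0 rather than 1.

```python
import math


def exp_terms(x, count):
    """First `count` terms x**i / i! (i = 0, 1, ...) of the series for e**x."""
    terms, term = [], 1.0
    for i in range(count):
        terms.append(term)
        term *= x / (i + 1)
    return terms


def row_order_sum(a, b):
    """Sum of all products a_i * b_j, taken row by row."""
    return sum(ai * bj for ai in a for bj in b)


def diagonal_order_sum(a, b):
    """The same products summed along the diagonals i + j = const."""
    n = len(a)  # both lists are assumed to have the same length
    total = 0.0
    for d in range(2 * n - 1):
        for i in range(max(0, d - n + 1), min(d, n - 1) + 1):
            total += a[i] * b[d - i]
    return total


a, b = exp_terms(1.5, 30), exp_terms(-0.7, 30)
print(row_order_sum(a, b), diagonal_order_sum(a, b), math.exp(0.8))
```

Both orders of summation give the product of the two sums, here $e^{1.5} \cdot e^{-0.7} = e^{0.8}$, up to rounding.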


§ 2. Functional Series


7. Deviation of Functions. If the terms of a series are not numbers
(as in § 1) but functions, the question arises in what sense we
must understand the fact that the partial sums, which are functions,
approach the sum of the series (which is also a function) in the case
of convergence. Hence, the question is how to estimate the deviation
of one function from another.

It turns out that we can do this in different ways which are not
equivalent to one another whereas the deviation of two numbers
$a$ and $b$ is always characterized by the quantity $|a - b|$.

Suppose we are given two functions $f(x)$ and $\varphi(x)$ defined in the same
finite interval $a \le x \le b$. The quantity

$$\max_{a \le x \le b} |f(x) - \varphi(x)| \qquad (35)$$

is called the maximal (uniform) deviation of the functions $f$ and
$\varphi$ from each other. It is also sometimes referred to as Chebyshev's
deviation. The geometric meaning of the quantity is illustrated
in Fig. 336. This notion can be applied only to bounded functions.
As a rule, we use it when continuous functions are considered. If
the uniform deviation of two functions is small the difference between
the values of $f(x)$ and $\varphi(x)$ is small at each point $x$ of the
interval $a \le x \le b$ and vice versa.

The quantity

$$\int_a^b |f(x) - \varphi(x)|\,dx \qquad (36)$$

is called the mean deviation of the functions $f$ and $\varphi$ from each
other. Its geometric meaning is implied by Fig. 336: the quantity
equals the area shaded in the figure (taken in its absolute value).
The so-called mean square deviation is defined as

$$\sqrt{\int_a^b [f(x) - \varphi(x)]^2\,dx} \qquad (37)$$

which is analogous to (36) in many respects but is more convenient
for calculations. Deviations (36) and (37) are used not only for
continuous functions but also for unbounded ones if the integral
(which is improper in this case) is convergent (Sec. XIV.16). There
are some other forms of deviation which are also of use but we do
not discuss them here.

If we replace the difference $f(x) - \varphi(x)$ entering into formulas
(36) and (37) by maximal deviation (35) the integrals can only
increase and hence we obtain

$$\left.\begin{aligned} \int_a^b |f(x) - \varphi(x)|\,dx &\le (b - a)\max_{a \le x \le b}|f(x) - \varphi(x)| \\ \sqrt{\int_a^b [f(x) - \varphi(x)]^2\,dx} &\le \sqrt{b - a}\,\max_{a \le x \le b}|f(x) - \varphi(x)| \end{aligned}\right\} \qquad (38)$$

Consequently, if the maximal deviation of two functions from each
other is small their mean and mean square deviations are also
small. At the same time it can happen that the maximal deviation
of two functions is large whereas their mean deviations are small.
The possibility is illustrated in Fig. 337.

Fig. 336    Fig. 337
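The three deviations (35), (36) and (37) can be estimated numerically on a grid. The sketch below uses a uniform grid and the trapezoidal rule; the function name `deviations`, the grid size and the narrow "spike" used as a test function are illustrative assumptions, chosen to mimic the situation described above of a large maximal deviation with small mean deviations.

```python
import math


def deviations(f, phi, a, b, samples=10001):
    """Grid estimates of the maximal, mean and mean square deviations."""
    h = (b - a) / (samples - 1)
    diffs = [abs(f(a + i * h) - phi(a + i * h)) for i in range(samples)]

    def trapezoid(values):
        return h * (sum(values) - 0.5 * (values[0] + values[-1]))

    maximal = max(diffs)                                    # formula (35)
    mean = trapezoid(diffs)                                 # formula (36)
    mean_square = math.sqrt(trapezoid([d * d for d in diffs]))  # formula (37)
    return maximal, mean, mean_square


def spike(x):
    """Narrow bump of height 1 centred at x = 0.5."""
    return math.exp(-(200.0 * (x - 0.5)) ** 2)


print(deviations(spike, lambda x: 0.0, 0.0, 1.0))
```

For the spike the maximal deviation from zero is close to 1 while the mean deviation is below 0.01, and both estimates respect inequalities (38).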

8. Convergence of a Functional Series. We now consider a series
of the form

$$f_1(x) + f_2(x) + \dots + f_n(x) + \dots \qquad (39)$$

whose terms are the functions $f_k(x)$ $(k = 1, 2, \dots)$ defined over
the same finite interval $a \le x \le b$.

We say that the series converges on this interval to a function $S(x)$,
which is called the sum of the series, if the deviation of the partial
sum $S_n(x) = \sum_{k=1}^n f_k(x)$ from $S(x)$ tends to zero as $n$ increases.
Depending on the form of the deviation (see Sec. 7) we speak about
the type of the convergence of series (39). For instance, if

$$\max_{a \le x \le b}|S(x) - S_n(x)| \xrightarrow[n\to\infty]{} 0$$

we say that series (39) uniformly converges to its sum $S(x)$. Similarly,
we say that the series converges to $S(x)$ in the mean or in
the mean square* depending on whether we have

$$\int_a^b |S(x) - S_n(x)|\,dx \xrightarrow[n\to\infty]{} 0 \quad\text{or}\quad \sqrt{\int_a^b [S(x) - S_n(x)]^2\,dx} \xrightarrow[n\to\infty]{} 0$$

Inequalities (38) indicate that if series (39) converges uniformly
it also converges in the mean (of any order) to the same sum. The
converse statement may not be true in the general case.

If series (39) uniformly converges to the sum $S(x)$ on an interval
$a \le x \le b$ we have

$$f_1(c) + f_2(c) + \dots + f_n(c) + \dots = S(c)$$

for each number $c$ belonging to the interval. Actually, the difference
between the value of the $n$th partial sum of the series at the point $c$
and $S(c)$ does not exceed the maximal deviation of $S_n(x)$ from
$S(x)$ (why?) and therefore it tends to zero as $n \to \infty$. This property
makes it possible to obtain numerical series with known sums from
a functional series when its sum is known.

* Generally, we say that a series with partial sums $S_n(x)$ converges in the
mean of order $p$ to its sum $S(x)$ on an interval $a \le x \le b$ if

$$\lim_{n\to\infty}\left(\int_a^b |S(x) - S_n(x)|^p\,dx\right)^{\frac1p} = 0$$

Hence, in this English edition the term “convergence in the mean” is understood
as “convergence in the mean of order one” and the term “convergence in the
mean square” as “convergence in the mean of order two”. When the term
“convergence in the mean” is used in mathematical books without qualification
it is sometimes understood as “convergence in the mean of order two” and
sometimes as “convergence in the mean of order one”. (Tr.)

To test a series for uniform convergence we usually apply
Weierstrass' test whose condition is sufficient for the uniform
convergence: if all $f_n(x)$ $(a \le x \le b)$ satisfy the inequalities

$$|f_n(x)| \le a_n \quad (n = 1, 2, 3, \dots;\ a \le x \le b) \quad\text{and}\quad \sum_{n=1}^\infty a_n < \infty \qquad (40)$$

series (39) converges uniformly. To prove the assertion we take
advantage of the comparison test (see Sec. 1) and conclude that
series (39) absolutely converges (as a numerical series) to a sum
$S(x)$ for each fixed value of $x$. At the same time

$$\max_{a \le x \le b}|S(x) - S_n(x)| = \max_{a \le x \le b}\left|\sum_{k=n+1}^\infty f_k(x)\right| \le \max_{a \le x \le b}\sum_{k=n+1}^\infty |f_k(x)| \le \sum_{k=n+1}^\infty a_k$$

Since the last sum is the remainder of a convergent series (see
Sec. III.6) it tends to zero as $n \to \infty$.
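The estimate used in this proof can be observed numerically. The sketch below takes, purely as an example, $f_k(x) = \sin(kx)/k^2$ on $0 \le x \le 2\pi$ with the majorant $a_k = 1/k^2$, and compares the largest value of a (truncated) remainder on a grid with the corresponding remainder of the majorant series. The cutoff replacing the infinite upper limit and the function names are assumptions of the example.

```python
import math


def tail_max_deviation(n, cutoff, grid=201):
    """Max over a grid on [0, 2*pi] of |f_{n+1}(x) + ... + f_cutoff(x)|
    for f_k(x) = sin(k*x) / k**2."""
    worst = 0.0
    for i in range(grid):
        x = 2.0 * math.pi * i / (grid - 1)
        tail = sum(math.sin(k * x) / k**2 for k in range(n + 1, cutoff + 1))
        worst = max(worst, abs(tail))
    return worst


def majorant_tail(n, cutoff):
    """a_{n+1} + ... + a_cutoff for the majorant a_k = 1 / k**2."""
    return sum(1.0 / k**2 for k in range(n + 1, cutoff + 1))


for n in (5, 20, 80):
    print(n, tail_max_deviation(n, 400), majorant_tail(n, 400))
```

In every row the first number does not exceed the second, and the second decreases as $n$ grows, which is exactly the mechanism of Weierstrass' test.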
Condition (40) can also be written as

$$\sum_{n=1}^\infty \max_{a \le x \le b}|f_n(x)| < \infty$$

because the terms of the latter series can be taken as $a_n$. The tests
for convergence of series (39) in the mean or in the mean square
are similar to the above test. The conditions which are sufficient
for these types of convergence are put down, respectively, in the
forms

$$\sum_{n=1}^\infty \int_a^b |f_n(x)|\,dx < \infty \quad\text{and}\quad \sum_{n=1}^\infty \sqrt{\int_a^b [f_n(x)]^2\,dx} < \infty$$

but we shall not give the proof here.

There are cases when series (39) does not converge on the whole interval
$a \le x \le b$ but converges on some subinterval $a_1 < x < b_1$ for
which $a \le a_1 < b_1 \le b$. The interval $a_1 < x < b_1$ is called the
domain of convergence of series (39).

In conclusion we note that an arbitrary variation of a finite number
of terms of series (39) does not affect its convergence or divergence
(although this can change its sum). This property is analogous to
the corresponding property of number series (see Sec. III.6).

9. Properties of Functional Series.

1. The sum of a uniformly convergent series whose terms are
continuous functions cannot have discontinuities. Indeed, if

$$f_1(x) + f_2(x) + \dots + f_n(x) + \dots = S(x) \qquad (a \le x \le b) \qquad (41)$$

we have

$$S(x) = [f_1(x) + \dots + f_n(x)] + [f_{n+1}(x) + f_{n+2}(x) + \dots] = S_n(x) + R_n(x) \qquad (42)$$

If the terms of the series are continuous functions, $S_n(x)$ is also
continuous as a sum of a finite number of continuous functions (see
Sec. III.14). On the other hand, if series (41) is uniformly convergent
its remainder $R_n(x)$ will be arbitrarily small over the whole
interval $a \le x \le b$ for sufficiently large values of $n$. Therefore
a small variation of $x$ yields small variations both of $S_n(x)$ and
$R_n(x)$, and thus the whole sum (42) gains a small increment as well
which means that the sum cannot have discontinuities.

We sometimes consider series of form (41) on a finite or infinite
interval $a < x < b$ which uniformly converge not on the whole
interval but only on each proper subinterval $a_1 \le x \le b_1$ lying
entirely in the interior of the former interval. Then we can apply
the above property to the interval $a_1 \le x \le b_1$ and then, making
$a_1$ approach $a$ and $b_1$ approach $b$, conclude that the sum of the series
cannot have discontinuities in the original interval $a < x < b$.
Analogous conclusions are also true for the properties enumerated
below.

If the terms of a series of form (41) are discontinuous we can apply
the above argument and conclude that if series (41) converges
uniformly its sum can have discontinuities only at the points where the
terms are discontinuous. In contrast to it, if a series converges in
the mean its sum can have new discontinuities and, moreover, it
can be discontinuous even when all the summands are continuous
functions. This is connected with the fact that continuous functions
$S_n(x)$ can converge in the mean to a discontinuous function, as
is illustrated in Fig. 270.
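A simple example of this effect, chosen here for illustration and not taken from the text, is given by the partial sums $S_n(x) = x^n$ on $0 \le x \le 1$ (the series with $f_1 = x$, $f_k = x^k - x^{k-1}$). They converge in the mean to the function equal to 0 for $x < 1$ and to 1 at $x = 1$, but not uniformly. The sketch below estimates the mean deviation and evaluates $S_n$ at a point where it equals $\frac12$; the function names are hypothetical.

```python
def mean_deviation(n, samples=20001):
    """Trapezoidal estimate of the integral of |x**n - 0| over [0, 1]."""
    h = 1.0 / (samples - 1)
    values = [(i * h) ** n for i in range(samples)]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def witness_deviation(n):
    """Value of x**n at x = 2**(-1/n), a point of [0, 1) where it equals 1/2."""
    x = 0.5 ** (1.0 / n)
    return x ** n


for n in (10, 100, 1000):
    print(n, mean_deviation(n), witness_deviation(n))
```

The mean deviation behaves like $\frac{1}{n+1}$ and tends to zero, while for every $n$ there is a point at which the deviation from the limit function is still $\frac12$.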

2. A uniformly convergent series can be integrated termwise,
i.e. under this assumption (41) implies, for any $x_0$ and $x$ from the
interval $a \le x \le b$, that

$$\int_{x_0}^x f_1(t)\,dt + \int_{x_0}^x f_2(t)\,dt + \dots + \int_{x_0}^x f_n(t)\,dt + \dots = \int_{x_0}^x S(t)\,dt$$

where the series thus obtained is uniformly convergent on the interval
$a \le x \le b$. In fact, we have

$$\left|\int_{x_0}^x S(t)\,dt - \sum_{k=1}^n \int_{x_0}^x f_k(t)\,dt\right| = \left|\int_{x_0}^x \Big[S(t) - \sum_{k=1}^n f_k(t)\Big]dt\right| = \left|\int_{x_0}^x [S(t) - S_n(t)]\,dt\right| \le$$

$$\le \left|\int_{x_0}^x |S(t) - S_n(t)|\,dt\right| \le \int_a^b |S(t) - S_n(t)|\,dt \le (b - a)\max_{a \le t \le b}|S(t) - S_n(t)| \xrightarrow[n\to\infty]{} 0$$

In the case of convergence in the mean we can write the same
inequalities omitting the last one and thus prove that a series
convergent in the mean can be integrated term-by-term and that the
series obtained after the integration uniformly converges on the
interval $a \le x \le b$.
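As an example of termwise integration (chosen here for illustration), the geometric series $1 + t + t^2 + \dots = \frac{1}{1-t}$ converges uniformly on $-\frac12 \le t \le \frac12$, and integrating it from 0 to $x$ gives $x + \frac{x^2}{2} + \frac{x^3}{3} + \dots = -\ln(1 - x)$. The sketch below checks the maximal deviation of the integrated partial sums on a grid; the function names are hypothetical.

```python
import math


def integrated_partial_sum(x, n):
    """Sum of the integrals from 0 to x of t**k, k = 0, ..., n - 1."""
    return sum(x ** (k + 1) / (k + 1) for k in range(n))


def max_deviation(n, grid=101):
    """Max over a grid on [-1/2, 1/2] of |-ln(1 - x) - integrated partial sum|."""
    worst = 0.0
    for i in range(grid):
        x = -0.5 + i / (grid - 1)
        worst = max(worst, abs(-math.log(1.0 - x) - integrated_partial_sum(x, n)))
    return worst


for n in (5, 10, 20, 40):
    print(n, max_deviation(n))
```

Here $b - a = 1$ and $\max|S - S_n| = 2 \cdot (\frac12)^n$, so the last inequality in the chain above bounds the printed deviations by $2 \cdot (\frac12)^n$.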

3. A series with continuous terms can be differentiated termwise
if this results in a uniformly convergent series, that is under these
assumptions (41) implies

$$f_1'(x) + f_2'(x) + \dots + f_n'(x) + \dots = S'(x)$$

To prove the property we denote the sum of the latter series by
$Q(x)$. Then integrating this series term-by-term (which is permissible
on the basis of property 2) we arrive at the equality

$$S(x) - S(x_0) = \int_{x_0}^x Q(t)\,dt$$

Finally, differentiating the last relation, we obtain $Q(x) = S'(x)$
which is what we set out to prove.

It is possible to extend the notion of convergence of a functional
series with the help of generalized functions (see Sec. XIV.27)
and then all the restrictions imposed on the character of the
convergence may be dropped and we can integrate and differentiate
termwise any convergent (in the generalized sense) series.


§ 3. Power Series 


10. Interval of Convergence. A power series is written in the form

$$a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots \tag{43}$$

We have already encountered series of this type in our course (see
Sec. IV.16). Now we proceed to give the general theory of such
series. For the sake of simplicity let us suppose that there exists
a finite or infinite limit of the form

$$\lim_{n\to\infty} \frac{|a_n|}{|a_{n+1}|} = R \tag{44}$$

although the final results of our investigation will be valid for the
general case.


We can easily find out for what numerical values of $x$ series (43)
converges. Since

$$\lim_{n\to\infty} \frac{|a_{n+1}x^{n+1}|}{|a_n x^n|} = |x| \lim_{n\to\infty} \frac{|a_{n+1}|}{|a_n|} = \frac{|x|}{R} \tag{45}$$

we conclude, on the basis of D’Alembert’s test (see Sec. 2), that
series (43) is absolutely convergent for $|x| < R$, i.e. for

$$-R < x < R \tag{46}$$

Interval (46) is referred to as the interval of convergence of power
series (43), and $R$ is called the radius of convergence. Series (43)
diverges for $|x| > R$, that is for $-\infty < x < -R$ and $R < x < \infty$.
Indeed, outside the interval of convergence limit (45) exceeds unity
which implies the divergence. Limit (45) is equal to 1 for $x = \pm R$,
that is D’Alembert’s test is inapplicable to the end-points of the
interval of convergence. Examples show that depending on the
particular properties of a power series it can be convergent or
divergent at an end-point of its interval of convergence.
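
Limit (44) can be observed numerically. The following is only a minimal sketch: the coefficients $a_n = 1/(n \cdot 2^n)$ are a hypothetical example chosen for illustration (they do not come from the text), and the helper name `ratio_estimates` is likewise an assumption. For these coefficients the ratio $|a_n| / |a_{n+1}|$ equals $2(n+1)/n$ and therefore tends to $R = 2$.

```python
from fractions import Fraction

def ratio_estimates(coeff, n_values):
    """Return |a_n / a_{n+1}| for each requested index n, cf. limit (44)."""
    return [abs(coeff(n) / coeff(n + 1)) for n in n_values]

def coeff(n):
    # Hypothetical coefficients a_n = 1 / (n * 2**n), n >= 1 (illustration only).
    return Fraction(1, n * 2**n)

indices = [1, 10, 100, 1000]
estimates = ratio_estimates(coeff, indices)
for n, r in zip(indices, estimates):
    print(n, float(r))
```

The printed ratios approach 2 from above, which suggests the interval of convergence $-2 < x < 2$ for this illustrative series; the behaviour at the end-points has to be examined separately, as stated above.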

Limit (44) does not exist in some cases but even then the interval 
of convergence can sometimes be found by means of D’Alembert’s 
test. As an example, take the series 

z3 xê z? x12 
1—2 tza aa tE 


Limit (44) does not exist for the series (why?). The series converges
for the values of $x$ which yield


A [amn] i; | 28" | }- 
lim {p gaem Gea le <4 


Hence, the series converges for $|x^3| < 2^2 = 4$, that is its interval
of convergence is $-\sqrt[3]{4} < x < \sqrt[3]{4}$. The series diverges at
the end-point $x = -\sqrt[3]{4}$ of the interval of convergence and
conditionally converges at the end-point $x = \sqrt[3]{4}$ (check these
assertions!).
If D’Alembert’s test is inapplicable we can nevertheless prove
that the domain of convergence of a series of form (43) is an interval
of type (46) but it is more difficult to find the value of $R$ in this case.
If $R = \infty$ series (43) converges for all $x$, i.e. over the whole $x$-axis,
although it can be inapplicable for practical calculations when the
values of $|x|$ become large (see the end of Sec. 4). The case $R = 0$
is also theoretically possible. But then series (43) converges only
at the single point $x = 0$ and therefore we shall not treat such series
here.
Let the reader verify that the radii of convergence for series
(IV.55)-(IV.61) are equal, respectively, to ∞, ∞, ∞, ∞, ∞, 1 and 1.
We also consider power series of the form

$$a_0 + a_1 (x - a) + a_2 (x - a)^2 + \dots + a_n (x - a)^n + \dots \tag{47}$$

Denoting $x - a$ by $x_1$ we reduce the series to form (43) (with respect
to the new variable $x_1$). Hence, the series converges for

$$-R < x - a < R, \quad \text{i.e. for} \quad a - R < x < a + R$$


11. Properties of Power Series.
1. A series of the form

$$a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots \tag{48}$$

uniformly converges (see Sec. 8) on each interval $-R_1 \le x \le R_1$
where $0 < R_1 < R$ and $R$ is the radius of convergence of series
(48). Indeed, we can write

$$|a_0| = |a_0|, \quad |a_1 x| \le |a_1 R_1|, \quad |a_2 x^2| \le |a_2 R_1^2|, \quad |a_3 x^3| \le |a_3 R_1^3|, \ \dots$$

for such an interval and hence the terms of series (48) do not exceed
the corresponding terms of the series

$$|a_0| + |a_1 R_1| + |a_2 R_1^2| + |a_3 R_1^3| + \dots$$

in their absolute values. The latter series converges since $R_1$ lies
inside the interval of convergence. Thus, by Weierstrass’ test (see
Sec. 8), series (48) uniformly converges on the interval.

In the general case a power series may not uniformly converge
on the entire interval of convergence. But Abel proved that if series
(48) converges at an end-point of the interval of convergence, the
interval on which the uniform convergence is guaranteed can be
extended to this end-point.

2. The sum of series (48) is continuous inside its interval of
convergence. Indeed, this follows from property 1 in Sec. 9. Besides,
Abel’s theorem mentioned above implies that if series (48) converges
at an end-point of the interval of convergence its sum is continuous
at the end-point.

3. Term-by-term differentiation or integration of series (48)
does not change its radius of convergence. For instance, integrating
series (48) termwise we obtain the series

$$a_0 x + \frac{a_1}{2} x^2 + \frac{a_2}{3} x^3 + \dots + \frac{a_{n-1}}{n} x^n + \frac{a_n}{n+1} x^{n+1} + \dots$$

Let us compute its radius of convergence:

$$\lim_{n\to\infty} \frac{|a_{n-1}| / n}{|a_n| / (n+1)} = \lim_{n\to\infty} \frac{n+1}{n} \cdot \lim_{n\to\infty} \frac{|a_{n-1}|}{|a_n|} = R$$

Thus we obtain the same value of the radius, see (44).
4. The relation

$$a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots = S(x) \qquad (-R < x < R)$$

can be termwise integrated and differentiated any number of times.
This follows from properties 1 and 3 proved above and properties
2 and 3 in Sec. 9 because if the radius does not change when we
integrate or differentiate once it cannot change when the operations
are performed repeatedly.
In particular, properties 4 and 2 imply that the sum of a power
series possesses continuous derivatives of all orders within its
interval of convergence.

For example, let us take the series

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \dots \qquad (-1 < x < 1)$$

which can be obtained from series (IV.60) by putting $a = -1$ or
by applying formula (III.7) for the sum of an infinite geometric
progression (with common ratio less than unity in its modulus).
Integrating termwise we obtain

$$\int_0^x \frac{dx}{1 + x^2} = \arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots \qquad (-1 < x < 1) \tag{49}$$

Since the series on the right-hand side is convergent for $x = 1$ as
well, Abel’s theorem implies the validity of formula (49) for $x = 1$.
Hence we have managed to find the sum of an interesting series of
the form

$$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots = \arctan 1 = \frac{\pi}{4}$$
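
The convergence at the end-point $x = 1$ is slow, which a short numerical experiment makes visible. The sketch below is only an illustration; the function name `arctan_series` and the numbers of terms are assumptions chosen for the example.

```python
import math

def arctan_series(x, terms):
    """Partial sum of x - x^3/3 + x^5/5 - ... with the given number of terms."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

# Inside the interval of convergence a few dozen terms suffice.
inner = arctan_series(0.5, 30)

# At the end-point x = 1 the error after N terms is of the order 1/(2N).
approx_pi = 4 * arctan_series(1.0, 100000)
print(inner, math.atan(0.5))
print(approx_pi, math.pi)
```

Since the series is alternating with decreasing terms, the error of a partial sum does not exceed the first omitted term, so one hundred thousand terms give $\pi$ only to about four decimal places.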


Performing termwise integrations and differentiations of a given
series we can sometimes reduce the series to a series whose sum is
known and thus find the sum of the series in question. As an example,
let us find the sum of the series

$$2 + \frac{3}{1!} x + \frac{4}{2!} x^2 + \frac{5}{3!} x^3 + \dots = S(x)$$

By D’Alembert’s test, we readily conclude that the series converges
throughout the whole $x$-axis, that is $R = \infty$. Let us multiply both
sides by $x$ and integrate the result from zero to some $x$:

$$\int_0^x x S(x)\,dx = x^2 + \frac{x^3}{1!} + \frac{x^4}{2!} + \frac{x^5}{3!} + \dots = x^2 \left(1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots\right) = x^2 e^x$$

We obtain, by differentiating,

$$x S(x) = (x^2 e^x)' = 2x e^x + x^2 e^x$$

i.e. finally we have

$$S(x) = (2 + x)\,e^x$$

Let us consider an example of a different kind. Let it be necessary
to find the sum of the series

$$\frac{x^2}{1 \cdot 2} + \frac{x^3}{2 \cdot 3} + \frac{x^4}{3 \cdot 4} + \frac{x^5}{4 \cdot 5} + \dots = \varphi(x) \tag{50}$$

For this purpose we differentiate it termwise:

$$\varphi'(x) = x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \dots = -\ln(1 - x)$$

[see formula (IV.64)]. It follows that

$$\varphi(x) = -\int \ln(1 - x)\,dx = -x \ln(1 - x) - \int \frac{x}{1 - x}\,dx = x + (1 - x)\ln(1 - x) + C \tag{51}$$

To determine the value of $C$ we put $x = 0$ in formulas (50) and (51).
This yields $0 = \varphi(0) = C$. Finally,

$$\varphi(x) = x + (1 - x)\ln(1 - x)$$

In other examples of this type we can encounter integrals which
are inexpressible in terms of elementary functions.
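
Closed forms found in this way are easy to check against partial sums. The sketch below is only a numerical check under stated assumptions: the function names and the numbers of terms are illustrative, and the first series is taken with the general term $(n + 2)x^n / n!$ while the second is series (50) with the general term $x^{n+1} / (n(n+1))$.

```python
import math

def s_partial(x, terms=60):
    """Partial sum of 2 + 3x/1! + 4x^2/2! + 5x^3/3! + ..."""
    total, power_over_fact = 0.0, 1.0      # power_over_fact holds x**n / n!
    for n in range(terms):
        total += (n + 2) * power_over_fact
        power_over_fact *= x / (n + 1)
    return total

def phi_partial(x, terms=200):
    """Partial sum of series (50); meaningful for |x| < 1."""
    return sum(x ** (n + 1) / (n * (n + 1)) for n in range(1, terms))

def s_closed(x):
    return (2 + x) * math.exp(x)

def phi_closed(x):
    return x + (1 - x) * math.log(1 - x)

for x in (-0.9, 0.5):
    print(x, s_partial(x), s_closed(x), phi_partial(x), phi_closed(x))
```

Agreement of the two columns at a few sample points does not prove the formulas, of course, but it quickly exposes a slip of sign or a lost factor in the manipulations.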

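The sum of a functional series can sometimes be found by forming
a differential equation which is satisfied by the sum and solving it.
For instance, let us find the sum of the series

$$x + \frac{x^4}{4!} + \frac{x^7}{7!} + \frac{x^{10}}{10!} + \frac{x^{13}}{13!} + \dots = \psi(x) \tag{52}$$

To do this we differentiate formula (52) three times:

$$1 + \frac{x^3}{3!} + \frac{x^6}{6!} + \frac{x^9}{9!} + \frac{x^{12}}{12!} + \dots = \psi'(x), \tag{53}$$

$$\frac{x^2}{2!} + \frac{x^5}{5!} + \frac{x^8}{8!} + \frac{x^{11}}{11!} + \dots = \psi''(x), \tag{54}$$

$$x + \frac{x^4}{4!} + \frac{x^7}{7!} + \frac{x^{10}}{10!} + \dots = \psi'''(x)$$

We see that we have arrived at the original series, i.e.

$$\psi'''(x) - \psi(x) = 0$$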


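Applying the methods of Sec. XV.17 and solving the equation we
find:

$$\psi(x) = C_1 e^x + e^{-x/2}\left(C_2 \cos\frac{\sqrt{3}}{2}x + C_3 \sin\frac{\sqrt{3}}{2}x\right) \tag{55}$$

To compute $C_1$, $C_2$ and $C_3$ we substitute $x = 0$ into formulas (52),
(53) and (54) which results in $\psi(0) = 0$, $\psi'(0) = 1$ and $\psi''(0) = 0$.
These values determine the initial conditions for $\psi(x)$. By formula
(55), we obtain

$$C_1 + C_2 = 0, \quad C_1 - \frac{1}{2}C_2 + \frac{\sqrt{3}}{2}C_3 = 1, \quad C_1 - \frac{1}{2}C_2 - \frac{\sqrt{3}}{2}C_3 = 0$$

which implies

$$C_1 = \frac{1}{3}, \quad C_2 = -\frac{1}{3}, \quad C_3 = \frac{1}{\sqrt{3}}$$

Finally, the sum of series (52) is expressed as

$$x + \frac{x^4}{4!} + \frac{x^7}{7!} + \frac{x^{10}}{10!} + \dots = \frac{1}{3}e^x + e^{-x/2}\left(-\frac{1}{3}\cos\frac{\sqrt{3}}{2}x + \frac{1}{\sqrt{3}}\sin\frac{\sqrt{3}}{2}x\right)$$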

In other cases we can sometimes similarly reduce the sum of a
given number series to an integral or even to a simple combination
of mathematical constants (i.e. to integers, the numbers $\pi$ and $e$ etc.)
and functions of the constants. As an example, let us take the sum

$$\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots = s \tag{56}$$

To compute it, we introduce an auxiliary series of the form

$$\frac{x}{1^2} + \frac{x^2}{2^2} + \frac{x^3}{3^2} + \dots = q(x) \qquad (-1 \le x \le 1)$$

Differentiating the series we derive

$$q(x) = -\int_0^x \frac{\ln(1 - x)}{x}\,dx$$

(check it up!). Substituting $x = 1$ we obtain the sum of series (56):

$$s = -\int_0^1 \frac{\ln(1 - x)}{x}\,dx \tag{57}$$

The corresponding indefinite integral is not an elementary
function but nevertheless it is sometimes preferable to express the sum
of a series in the form of a definite integral. By the way, in Sec. 25
we shall give another method of computing the sum of series (56)
which will show that the sum equals $\pi^2/6$. From this, in particular,
we obtain the numerical value of integral (57).

12. Algebraic Operations on Power Series. The power series being
absolutely convergent in the interior of their intervals of
convergence, we can termwise add them together, multiply by a common
factor (see Sec. 3) and multiply them by one another following the
rules of multiplication of polynomials (see Sec. 6).
For example, let us consider the multiplication of the power
series expressing the functions $e^x$ and $\ln(1 + x)$:

$$e^x \ln(1 + x) = \left(1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots\right)\left(x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots\right) = x + \frac{x^2}{2} + \frac{x^3}{3} + 0 \cdot x^4 + \dots$$

Obviously, we can compute as many coefficients entering into the
product as needed. The radius of convergence of the first series is
equal to ∞ whereas that of the second series is equal to 1. Hence,
the above result is valid for the interval $-1 < x < 1$ in which
both series are absolutely convergent.
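
The multiplication "following the rules of multiplication of polynomials" is easily mechanized. The sketch below is only an illustration: the helper name `multiply` and the truncation order `N` are assumptions, and exact rational arithmetic is used so that the coefficients come out as fractions.

```python
from fractions import Fraction
from math import factorial

def multiply(a, b):
    """Product of two truncated power series given by coefficient lists.

    Only the coefficients that are fully determined by the retained terms
    are returned, i.e. those in x^0, ..., x^(n-1) with n = min(len(a), len(b)).
    """
    n = min(len(a), len(b))
    return [sum(a[k] * b[i - k] for k in range(i + 1)) for i in range(n)]

N = 6  # keep the powers x^0 ... x^5
exp_coeffs = [Fraction(1, factorial(k)) for k in range(N)]
log_coeffs = [Fraction(0)] + [Fraction((-1) ** (k + 1), k) for k in range(1, N)]

product = multiply(exp_coeffs, log_coeffs)
print(product)
```

With these six retained terms the coefficients of the product in $x, x^2, x^3, x^4, x^5$ come out as $1, \frac{1}{2}, \frac{1}{3}, 0, \frac{3}{40}$.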

The division of series is performed in a similar way. For instance,
consider the ratio of the power series for $\sin x$ and $\cos x$.
To perform the division we take these series arranged in ascending
powers of the variable and divide them as if they were polynomials:

$$\frac{\sin x}{\cos x} = \frac{x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!} - \dfrac{x^7}{7!} + \dots}{1 - \dfrac{x^2}{2!} + \dfrac{x^4}{4!} - \dfrac{x^6}{6!} + \dots} = x + \frac{x^3}{3} + \frac{2}{15}x^5 + \frac{17}{315}x^7 + \dots$$

The successive remainders of this division are

$$\frac{x^3}{3} - \frac{x^5}{30} + \frac{x^7}{840} - \dots, \qquad \frac{2}{15}x^5 - \frac{4}{315}x^7 + \dots, \qquad \frac{17}{315}x^7 - \dots$$

Thus, the expansion of the tangent into a power series is of the
following form:

$$\tan x = x + \frac{x^3}{3} + \frac{2}{15}x^5 + \frac{17}{315}x^7 + \dots \tag{58}$$

To compute the subsequent terms in (58) we must put down the
subsequent terms in the expansions of $\sin x$ and $\cos x$ and continue
the division. It can be proved that formula (58) is valid for
$|x| < \frac{\pi}{2}$.

Expansion (58) can also be obtained by means of the method
of undetermined coefficients. To apply the method we note that
$\tan x$ is an odd function and therefore its expansion must contain
only the odd powers of $x$:

$$\tan x = a_1 x + a_3 x^3 + a_5 x^5 + a_7 x^7 + \dots$$

But $\cos x \cdot \tan x = \sin x$, and therefore

$$\left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots\right)\left(a_1 x + a_3 x^3 + a_5 x^5 + a_7 x^7 + \dots\right) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$$

Removing the brackets and equating the coefficients in like powers
of $x$ we obtain:

$$a_1 = 1, \quad a_3 - \frac{a_1}{2!} = -\frac{1}{3!}, \quad a_5 - \frac{a_3}{2!} + \frac{a_1}{4!} = \frac{1}{5!}, \ \dots$$

Hence, the coefficients $a_1, a_3, a_5, \dots$ can be found in succession
from the above relations.
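
Both the division and the method of undetermined coefficients lead to the same recurrence: each new coefficient of the quotient is found from the relation obtained by equating the coefficients in the corresponding power of $x$. A minimal sketch of this recurrence is given below; the function name `divide` and the truncation order `N` are assumptions made for the illustration.

```python
from fractions import Fraction
from math import factorial

def divide(num, den):
    """Quotient of two truncated power series; den[0] must be non-zero.

    The i-th coefficient q_i is found from
    q_0*den_i + q_1*den_(i-1) + ... + q_i*den_0 = num_i.
    """
    n = min(len(num), len(den))
    q = []
    for i in range(n):
        known = sum(q[k] * den[i - k] for k in range(i))
        q.append((num[i] - known) / den[0])
    return q

N = 8  # keep the powers x^0 ... x^7
sin_c = [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 1 else Fraction(0)
         for k in range(N)]
cos_c = [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 0 else Fraction(0)
         for k in range(N)]

tan_c = divide(sin_c, cos_c)
print(tan_c)
```

The odd-numbered entries of `tan_c` reproduce the coefficients $1, \frac{1}{3}, \frac{2}{15}, \frac{17}{315}$ of expansion (58), and the even-numbered ones vanish, as they must for an odd function.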
Another method of manipulating series is the substitution of
a series into a series. For instance, we can substitute a series of the
form

$$y = f(x) = a_0 + a_1 x + a_2 x^2 + \dots$$

into the series

$$\varphi(y) = b_0 + b_1 y + b_2 y^2 + \dots \tag{59}$$

or, in a more general case, into the series

$$\psi(y) = c_0 + c_1 (y - a) + c_2 (y - a)^2 + \dots$$

For instance, taking series (59) we obtain the expression

$$\varphi(f(x)) = b_0 + b_1 (a_0 + a_1 x + a_2 x^2 + \dots) + b_2 (a_0 + a_1 x + a_2 x^2 + \dots)^2 + \dots$$

in which we must remove the brackets and combine similar terms.
For the result to be valid for $x = 0$ it is necessary that series (59)
converge for $y = a_0$ (why is it so?). This means that $a_0$ must lie
within the interval of convergence of series (59). By the way, if the
condition is not fulfilled the calculations themselves will indicate
the mistake.

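Take an example:

$$\begin{aligned}
\ln(1 + \sin x) &= \frac{\sin x}{1} - \frac{(\sin x)^2}{2} + \frac{(\sin x)^3}{3} - \dots = \\
&= \left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots\right) - \frac{1}{2}\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots\right)^2 + \frac{1}{3}\left(x - \frac{x^3}{3!} + \dots\right)^3 - \dots = \\
&= \left(x - \frac{x^3}{6} + \frac{x^5}{120} - \frac{x^7}{5040}\right) - \frac{1}{2}\left(x^2 - \frac{x^4}{3} + \frac{2x^6}{45}\right) + \frac{1}{3}\left(x^3 - \frac{x^5}{2} + \frac{13x^7}{120}\right) - \\
&\quad - \frac{1}{4}\left(x^4 - \frac{2x^6}{3}\right) + \frac{1}{5}\left(x^5 - \frac{5x^7}{6}\right) - \frac{x^6}{6} + \frac{x^7}{7} - \dots
\end{aligned}$$

When calculating in succession the powers entering into the resulting
series we have performed the multiplications for the Taylor series
of $\sin x$ according to the rule of multiplication of polynomials and
dropped the powers of $x$ higher than the power corresponding to the
desired accuracy of calculations (higher than $x^7$). Now, combining
similar terms we finally derive

$$\ln(1 + \sin x) = x - \frac{x^2}{2} + \frac{x^3}{6} - \frac{x^4}{12} + \frac{x^5}{24} - \frac{x^6}{45} + \frac{61}{5040}x^7 - \dots$$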
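
The substitution of a series into a series can also be carried out mechanically. The following sketch is only an illustration under the assumption that the inner series has no constant term (as is the case for $\sin x$), so that every power of it begins with a higher power of $x$ and the truncation is exact; the helper names `multiply` and `compose` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def multiply(a, b):
    """Truncated product of two coefficient lists of equal length."""
    n = len(a)
    return [sum(a[k] * b[i - k] for k in range(i + 1)) for i in range(n)]

def compose(outer, inner):
    """Coefficients of outer(inner(x)) up to the retained order.

    Assumes inner[0] == 0, so inner**k starts with x^k.
    """
    n = len(inner)
    result = [Fraction(0)] * n
    power = [Fraction(1)] + [Fraction(0)] * (n - 1)   # inner**0
    for b in outer:
        result = [r + b * p for r, p in zip(result, power)]
        power = multiply(power, inner)
    return result

N = 8  # keep the powers x^0 ... x^7
sin_c = [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 1 else Fraction(0)
         for k in range(N)]
log_c = [Fraction(0)] + [Fraction((-1) ** (k + 1), k) for k in range(1, N)]

series = compose(log_c, sin_c)
print(series)
```

The list obtained contains the coefficients $1, -\frac{1}{2}, \frac{1}{6}, -\frac{1}{12}, \frac{1}{24}, -\frac{1}{45}, \frac{61}{5040}$ of the expansion of $\ln(1 + \sin x)$ derived above.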


The methods discussed in Secs. 11 and 12 make it possible to obtain
the expansions of many other functions by taking advantage of the
simplest series given in Sec. IV.16. It is sometimes difficult to write
down the explicit expression of the general term of such an
expansion but at the same time we can always find any number of terms
which is usually sufficient for practical purposes.

13. Power Series as a Taylor Series. We now consider the sum of
a series of the form

$$f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots \qquad (-R < x < R) \tag{60}$$

The coefficients of the series can be easily expressed in terms of
its sum. For this purpose we perform successive differentiations of
formula (60) and substitute $x = 0$ into the results, as it was done
in Sec. IV.15. This yields

$$\begin{aligned}
& f(0) = a_0; \\
& f'(x) = 1a_1 + 2a_2 x + 3a_3 x^2 + \dots, & f'(0) &= 1a_1; \\
& f''(x) = 1 \cdot 2a_2 + 2 \cdot 3a_3 x + 3 \cdot 4a_4 x^2 + \dots, & f''(0) &= 1 \cdot 2a_2; \\
& f'''(x) = 1 \cdot 2 \cdot 3a_3 + 2 \cdot 3 \cdot 4a_4 x + 3 \cdot 4 \cdot 5a_5 x^2 + \dots, & f'''(0) &= 1 \cdot 2 \cdot 3a_3 \quad \text{etc.}
\end{aligned}$$

Finding $a_0, a_1, a_2, \dots$ from these relations and substituting
them into (60) we obtain the expression

$$f(x) = f(0) + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + \dots + \frac{f^{(n)}(0)}{n!}x^n + \dots \qquad (-R < x < R) \tag{61}$$

which is nothing but Taylor’s series (IV.54) we have already dealt
with. Thus, a power series is Taylor’s series of its sum.

The coefficients of series (61) being uniquely expressed in terms 
of its sum, we can assert, in particular, that if the sums of two power 
series are identically equal their coefficients in like powers of x 
also coincide. Accordingly, if the sum of a power series is identically 
equal to zero all its coefficients are also equal to zero. 

In the above argument we have regarded a power series as being
given. But in practice we usually deal with the reverse problem of
expanding a given function $f(x)$. Then, naturally, we encounter
the problem of determining the range of the values of $x$ for which
formula (61) is valid. On the basis of Secs. IV.15, 16, we conclude
that this is equivalent to the question of what the values of $x$ are for
which the remainder of the corresponding finite Taylor formula
tends to zero when the number $n$ increases.

An exhaustive investigation of the remainder can be carried out
only in some simple cases. Fortunately, we can easily do without
such an investigation when we deal with elementary functions
because it is possible to prove that formula (61) is valid for an
elementary function $f(x)$ on every interval in which the series
converges provided that the direct substitution of $x = 0$ into the
expressions $f(x), f'(x), f''(x), \dots$ yields the finite results $f(0), f'(0),
f''(0), \dots$. This implies, in particular, that the expansions given
in Sec. IV.16 are valid for the intervals of convergence of the
corresponding series.

It should be remarked that there are many functions that cannot 
be expanded into power series (i.e. into Taylor’s series). For instance, 
a function which is discontinuous in an interval or has a derivative 
of the first or of a higher order with a discontinuity on the interval 
cannot be represented by a power series. The same is true for a func- 
tion which is represented by means of different formulas on different 
parts of the interval under consideration. 

All that has been said here is directly extended to series in powers
of $x - a$ of form (47) and to corresponding Taylor’s series (IV.53).

14. Power Series with Complex Terms. These series are of the form

$$a_0 + a_1 z + a_2 z^2 + \dots + a_n z^n + \dots \qquad (z = x + iy) \tag{62}$$

where the coefficients $a_n$ and the independent variable $z$ can assume
any complex values. The theory of these series is analogous to that
of real power series but the inequality $|z| < R$ defining the values
of $z$ for which series (62) is convergent represents not an interval
but a circle in the complex $z$-plane. The circle is referred to as the
circle of convergence of series (62) (see Fig. 338). Similarly, for a
series in powers of $z - a$ where $a$ is a complex number the domain of
convergence is specified by an inequality of the form $|z - a| < R$
which defines a circle of radius $R$ with centre at the point $a$. If
$R = \infty$ the series converges throughout the whole complex plane.
The properties enumerated in Secs. 11 and 12 can be transferred to
the series of form (62) without essential changes.
The sum $S(z)$ of such a series is a complex function of a complex
variable (see Sec. VIII.11). For these functions the notion of an
integral is introduced by means of the notion of an antiderivative.
In Sec. VIII.4 we considered several examples of defining a
function for complex values of its argument by means of series of
form (62).

Fig. 338

As it was mentioned in Sec. VIII.4, the identities which are
satisfied by functions for real values of the argument remain valid for
its complex values when we continue the functions in the complex
plane by means of the corresponding power series. To illustrate
what has been said we take the equality

$$e^{\ln(1 + x)} = 1 + x \tag{63}$$

It is valid for real $x$, by the definition of the logarithm. Hence, if
we substitute the power series of $y = \ln(1 + x)$ (see Sec. 12) into
the series of $e^y$ and perform the corresponding identical
transformations (removing the brackets etc.) we must obtain $1 + x$.
Performing the same operations for the complex values of the argument,
i.e. writing $z$ in place of $x$, we obtain

$$e^{\ln(1 + z)} = 1 + z \tag{64}$$

which is what we set out to prove.

This formula implies, in particular, that the definition of the
logarithm by means of power series (IV.61) into which $z$ is
substituted for $x$ is coherent with the definition of the logarithm of a
complex number given in Sec. VIII.5. The sum of the series represents
only one branch of the infinite-valued function $\ln(1 + z)$, namely
the branch which yields zero when $z = 0$ is substituted into the
function.

Formula (VIII.7) and many other formulas can be proved in
a similar way.

15. Bernoullian Numbers. The so-called Bernoullian numbers
discovered by Jacob Bernoulli are widely applied in the theory
of series and particularly in the theory of power series. These
numbers, which we denote by $\beta_1, \beta_2, \beta_3, \beta_4, \dots$, are defined by a
symbolic recurrence relation of the form

$$(\beta + 1)^{n+1} - \beta^{n+1} = 0 \qquad (n = 1, 2, 3, \dots)$$

in which $\beta^k$ should be replaced by $\beta_k$ after the brackets have been
removed on the left-hand side.

For instance, putting $n = 1$, $n = 2$, $n = 3$, in succession, we
obtain, respectively,

$$\beta_2 + 2\beta_1 + 1 - \beta_2 = 0$$

which yields $\beta_1 = -\frac{1}{2}$,

$$\beta_3 + 3\beta_2 + 3\beta_1 + 1 - \beta_3 = 0$$

which yields $\beta_2 = -\frac{3\beta_1 + 1}{3} = \frac{1}{6}$, and

$$\beta_4 + 4\beta_3 + 6\beta_2 + 4\beta_1 + 1 - \beta_4 = 0$$

which yields $\beta_3 = 0$.

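The subsequent calculations result in

$$\beta_4 = -\frac{1}{30}, \quad \beta_5 = 0, \quad \beta_6 = \frac{1}{42}, \quad \beta_7 = 0, \quad \beta_8 = -\frac{1}{30},$$

$$\beta_9 = 0, \quad \beta_{10} = \frac{5}{66}, \quad \beta_{11} = 0, \quad \beta_{12} = -\frac{691}{2730}, \quad \beta_{13} = 0, \quad \beta_{14} = \frac{7}{6}, \ \dots$$

It is possible to prove that all $\beta_n$ with odd numbers $n \ge 3$ are
equal to zero. Introducing the notation

$$B_n = (-1)^{n-1} \beta_{2n} \qquad (n = 1, 2, 3, \dots)$$

we write

$$B_1 = \frac{1}{6}, \quad B_2 = \frac{1}{30}, \quad B_3 = \frac{1}{42}, \quad B_4 = \frac{1}{30}, \quad B_5 = \frac{5}{66}, \quad B_6 = \frac{691}{2730}, \ \dots$$

These numbers are also referred to as Bernoullian numbers.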
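
After the brackets are removed and $\beta^k$ is replaced by $\beta_k$ (with $\beta^0$ replaced by 1), the symbolic relation turns into an ordinary recurrence expressing $\beta_n$ through the preceding numbers. The sketch below only illustrates this step-by-step computation; the function name `bernoulli` and the convention of storing 1 in position 0 are assumptions made for the example.

```python
from fractions import Fraction
from math import comb

def bernoulli(count):
    """Return [1, beta_1, ..., beta_count] from the symbolic relation.

    Expanding (beta + 1)^(n+1) - beta^(n+1) = 0 gives
    C(n+1, 0)*1 + C(n+1, 1)*beta_1 + ... + C(n+1, n)*beta_n = 0,
    which is solved for beta_n (note that C(n+1, n) = n + 1).
    """
    beta = [Fraction(1)]
    for n in range(1, count + 1):
        known = sum(comb(n + 1, k) * beta[k] for k in range(n))
        beta.append(-known / (n + 1))
    return beta

beta = bernoulli(14)
B = [(-1) ** (n - 1) * beta[2 * n] for n in range(1, 8)]   # B_1 ... B_7
print(beta[1:])
print(B)
```

The first list reproduces the values $\beta_1, \dots, \beta_{14}$ given above, and the second contains the positive numbers $B_n$.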
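We now put down some formulas involving Bernoullian numbers.
For the function $\zeta(x)$ (see Sec. 4) we have

$$\zeta(2n) = \sum_{k=1}^{\infty} \frac{1}{k^{2n}} = \frac{(2\pi)^{2n}}{2\,(2n)!} B_n \qquad (n = 1, 2, 3, \dots) \tag{65}$$

In particular, this yields

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{B_1 (2\pi)^2}{2 \cdot 2!} = \frac{\pi^2}{6}, \qquad \sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{B_2 (2\pi)^4}{2 \cdot 4!} = \frac{\pi^4}{90}$$

Formula (65) shows that all the numbers $B_n$ are positive. Further,
we have

$$\tan x = \sum_{n=1}^{\infty} \frac{2^{2n}(2^{2n} - 1)}{(2n)!} B_n x^{2n-1}$$

[compare with formula (58)] etc.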
16. Applying Series to Solving Difference Equations. A difference
equation connects an unknown quantity and its finite differences
(see Sec. V.7). We first turn to the case when the unknown quantity
is represented by a sequence $a_0, a_1, a_2, \dots, a_n, \dots$. A difference
equation defining the sequence can be put down in the general form

$$f(n, a_n, \Delta a_n, \Delta^2 a_n) = 0 \qquad (n = 0, 1, 2, \dots) \tag{66}$$

if we restrict ourselves to equations of the second order. Here
$\Delta a_n = a_{n+1} - a_n$ and $\Delta^2 a_n = \Delta a_{n+1} - \Delta a_n$. Substituting

$$\Delta a_n = a_{n+1} - a_n \quad \text{and} \quad \Delta^2 a_n = a_{n+2} - 2a_{n+1} + a_n$$

we reduce (66) to the form

$$g(n, a_n, a_{n+1}, a_{n+2}) = 0 \qquad (n = 0, 1, 2, \dots) \tag{67}$$

In a particular case the left-hand sides of equations (66) and (67)
may not contain all the arguments put down there.

To solve equation (67) we can arbitrarily set two values of the
unknown quantity, for instance, $a_0$ and $a_1$. Then we find $a_2$ by
putting $n = 0$ in (67). Further, putting $n = 1$ in (67) and
substituting the value $a_2$ found above we determine $a_3$ and so on. This
step-by-step procedure enables us to find any desired number of
terms of the sequence $a_n$.
A particular case of equation (67) is a homogeneous linear equation
with constant coefficients which is written as

$$\alpha a_n + \beta a_{n+1} + \gamma a_{n+2} = 0 \qquad (n = 0, 1, 2, \dots;\ \alpha, \beta, \gamma = \text{const}) \tag{68}$$


The solution of equation (68) can be found in the general form 
by means of a method which is illustrated below. Equations of any 
order can be solved in a similar way. 

Let us form the so-called generating function of the sought-for
sequence whose expansion into a power series of the form

$$Q = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots$$

gives rise to the terms of the sequence as the coefficients in the
series. It is only the coefficients of the series that we are interested
in here and therefore we are not going to consider any numerical
values of $x$ and the question of the convergence of the series. In
such a case a power series is referred to as a formal power series.
We can easily find the product

$$(\gamma + \beta x + \alpha x^2)\,Q = \gamma a_0 + (\beta a_0 + \gamma a_1)x + (\alpha a_0 + \beta a_1 + \gamma a_2)x^2 + (\alpha a_1 + \beta a_2 + \gamma a_3)x^3 + \dots$$

Equation (68) implies that all the coefficients on the right-hand
side, from that in $x^2$ onwards, are equal to zero. Performing the
division we derive

$$Q = \frac{\gamma a_0 + (\beta a_0 + \gamma a_1)x}{\gamma + \beta x + \alpha x^2} \tag{69}$$
The values of $a_0$ and $a_1$ given, we have the ratio of two
polynomials with the given coefficients on the right-hand side. It can be
decomposed into partial fractions of the form $\frac{A}{(x - a)^k}$ by applying
the methods discussed in Sec. VIII.10. Each fraction can be rewritten
as

$$\frac{A}{(x - a)^k} = \frac{A}{(-a)^k \left(1 - \dfrac{x}{a}\right)^k} = \frac{B}{(1 - \mu x)^k}$$

where $B = \frac{A}{(-a)^k}$ and $\mu = \frac{1}{a}$. We have $k = 1$ or $k = 2$ for fraction
(69) but for difference equations of higher order the values of $k$
may be greater. These fractions are expanded into power series
according to the formulas which are obtained from the formula of
the sum of a geometric series by means of differentiation:

$$\frac{B}{1 - \mu x} = B + B\mu x + B\mu^2 x^2 + \dots + B\mu^n x^n + \dots$$

$$\frac{B}{(1 - \mu x)^2} = \frac{1}{\mu}\left(\frac{B}{1 - \mu x}\right)' = B + 2B\mu x + 3B\mu^2 x^2 + \dots + (n + 1)B\mu^n x^n + \dots$$

and so on. Adding together the coefficients in $x^n$ entering into all
the above series we thus determine the coefficient $a_n$ in the series
$Q = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots$ and hence equation
(68) has been solved in the general form.

We suggest that the reader apply the above method to deriving
the general formula of the so-called Fibonacci numbers $a_0 = 0$,
$a_1 = 1$, $a_2 = 1$, $a_3 = 2$, $a_4 = 3$, $a_5 = 5$, ... each of which, from
the third onwards, is equal to the sum of the two preceding
numbers. The sought-for expression is of the form

$$a_n = \frac{(1 + \sqrt{5})^n - (1 - \sqrt{5})^n}{\sqrt{5} \cdot 2^n}$$
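
The step-by-step procedure for equation (68) and the closed expression for the Fibonacci numbers can be compared directly. The sketch below is only an illustration; the function names are hypothetical, and the Fibonacci numbers are obtained from (68) under the assumption $\alpha = -1$, $\beta = -1$, $\gamma = 1$, which corresponds to $a_{n+2} = a_n + a_{n+1}$.

```python
import math
from fractions import Fraction

def solve_linear(alpha, beta, gamma, a0, a1, count):
    """First `count` terms of alpha*a_n + beta*a_(n+1) + gamma*a_(n+2) = 0.

    Each new term is found from the two preceding ones (gamma must be non-zero).
    """
    a = [Fraction(a0), Fraction(a1)]
    while len(a) < count:
        a.append(-(alpha * a[-2] + beta * a[-1]) / gamma)
    return a[:count]

def fib_closed(n):
    """Closed expression ((1 + sqrt 5)^n - (1 - sqrt 5)^n) / (sqrt 5 * 2^n)."""
    s5 = math.sqrt(5)
    return ((1 + s5) ** n - (1 - s5) ** n) / (s5 * 2 ** n)

terms = solve_linear(-1, -1, 1, 0, 1, 20)
for n, a_n in enumerate(terms):
    print(n, a_n, fib_closed(n))
```

The two columns agree to within rounding errors of the floating-point evaluation of the closed expression.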
We also consider difference equations in which the unknown
quantity is a sought-for function $y(x)$. In this case instead of (66) and
(67) we have equations of the forms

$$f(x, y, \Delta_h y, \Delta_h^2 y) = 0 \quad \text{and} \quad \psi(x, y(x), y(x + h), y(x + 2h)) = 0 \tag{70}$$

respectively.

The latter case can be reduced to the former in which the unknown
quantity is represented by a sequence. For example, let $0 \le x < \infty$.
We introduce the notation $a_n = y(\xi + nh)$ $(n = 0, 1, 2, \dots)$
where $\xi$ is a constant number lying in the interval $0 \le \xi < h$.
Putting $x = \xi + nh$ in equation (70) we can rewrite it as

$$\psi(\xi + nh, a_n, a_{n+1}, a_{n+2}) = 0$$

and hence, for a constant $\xi$, we obtain an equation of form (67).
After $a_n$ has been found we can use the arbitrariness of the choice
of $\xi$ and thus obtain the sought-for solution $y(x)$. In particular,
it follows that we can arbitrarily set the values of $y(x)$ in the
interval $0 \le x < 2h$ for equation (70) (why is it so?).

17. Multiple Power Series. The role of multiple power series in 
the theory of functions of several arguments is similar to that of 
ordinary power series in the theory of functions of one independent 
variable. For the sake of simplicity we shall restrict ourselves to 
double power series. The power series of higher multiplicity are 
treated similarly. 

To put down the expression of a double series it is convenient
to use the double index notation which was applied to writing
a double number series in Sec. 6:

$$S(x, y) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} a_{mn} x^m y^n = a_{00} + a_{10}x + a_{01}y + a_{20}x^2 + a_{11}xy + a_{02}y^2 + \dots \tag{71}$$

If the series is absolutely convergent the order of performing its
summation does not matter.

The domain of convergence of series (71) is a region of the $x, y$-plane
(which may coincide with the whole plane in a particular case).
Such a domain can be of the form represented in Fig. 339. For any
fixed $y$ we obtain a power series in powers of $x$ whose radius of
convergence $R$ may depend on $y$, i.e. $R = R(y)$. Therefore the domain
of convergence is symmetric with respect to the $y$-axis. The symmetry
with respect to the $x$-axis is implied by the same argument. In the case
of absolute convergence of series (71), $R(y)$ is a non-increasing
function of $y$ for $y > 0$ (why?).

Series of the form

$$\sum_{m=0}^{\infty} \sum_{n=0}^{\infty} a_{mn} (x - a)^m (y - b)^n \tag{72}$$

are treated similarly. The domain of convergence of such a series is
a plane figure with centre of symmetry at the point $(a, b)$.

Fig. 339

The properties of multiple power series are analogous to those
of ordinary power series (see Secs. 11 and 12). In particular,
multiple power series are obtained in expanding a function of several
variables into Taylor’s series (see Sec. XII.6):

$$f(x, y) = f(0, 0) + \frac{f'_x(0, 0)}{1!}x + \frac{f'_y(0, 0)}{1!}y + \frac{f''_{xx}(0, 0)}{2!}x^2 + \frac{f''_{xy}(0, 0)}{1!\,1!}xy + \frac{f''_{yy}(0, 0)}{2!}y^2 + \dots$$

Series (72) are obtained in the same way. Multiple power series are
applicable when we use the small parameter method (see Secs. V.5
and XV.27) for an equation containing several parameters. They
can also be utilized in many other problems.

18. Functions of Matrices. Let $A$ be a square matrix (see Sec. XI).
For definiteness, let it be of the third order (the results that we
shall obtain here are true for matrices of any order). In Secs. XI.2
and XI.3 we introduced such simplest functions of matrices as $A^2$
and $A^{-1}$. But in what sense should we understand an expression
of the form $e^A$ and the like? The importance of the exponential
function in mathematics indicates the advisability of putting this
question.
It turns out that a reasonable answer to the question can be
obtained by means of power series if we apply them by analogy with
Sec. VIII.4 where the notion of a function of a complex variable
was defined. Suppose that we are given a function $f(x)$ which can
be expanded into a power series

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots + a_n x^n + \dots \tag{73}$$

By definition, we write

$$f(A) = a_0 I + a_1 A + a_2 A^2 + \dots + a_n A^n + \dots \tag{74}$$

where $I$ is the unit matrix of the same order as $A$. For example,
we have

$$e^A = I + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \dots + \frac{A^n}{n!} + \dots \tag{75}$$
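The definition makes sense if series (74) converges. We can point
out a simple condition for the convergence. Let us assume that
series (73) has the radius of convergence $R$ and suppose, for
simplicity, that all the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ of the matrix $A$ (see
Sec. XI.4) are distinct. Then, as it was shown in Sec. XI.8, the
matrix $A$ can be transformed to the diagonal form, that is there
exists a non-degenerate matrix $H$ for which

$$H^{-1} A H = \operatorname{diag}(\lambda_1, \lambda_2, \lambda_3) = \Lambda$$

But this implies

$$A = H \Lambda H^{-1}, \quad A^2 = (H \Lambda H^{-1})(H \Lambda H^{-1}) = H \Lambda^2 H^{-1}, \quad A^3 = A^2 A = (H \Lambda^2 H^{-1})(H \Lambda H^{-1}) = H \Lambda^3 H^{-1}$$

etc. Consequently, series (74) can be rewritten as

$$H a_0 I H^{-1} + H a_1 \Lambda H^{-1} + H a_2 \Lambda^2 H^{-1} + \dots = H (a_0 I + a_1 \Lambda + a_2 \Lambda^2 + \dots) H^{-1} \tag{76}$$

A diagonal matrix can be easily raised to a power:

$$\Lambda^2 = \operatorname{diag}(\lambda_1^2, \lambda_2^2, \lambda_3^2), \quad \Lambda^3 = \operatorname{diag}(\lambda_1^3, \lambda_2^3, \lambda_3^3) \tag{77}$$

etc.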


(let the reader verify that in the general case of multiplication of
diagonal matrices the product is a diagonal matrix whose elements
are the products of the corresponding diagonal elements of the
matrix factors). Therefore, we have

$$a_0 I + a_1 \Lambda + a_2 \Lambda^2 + \dots = \operatorname{diag}(a_0 + a_1 \lambda_1 + a_2 \lambda_1^2 + \dots,\ a_0 + a_1 \lambda_2 + a_2 \lambda_2^2 + \dots,\ a_0 + a_1 \lambda_3 + a_2 \lambda_3^2 + \dots) \tag{78}$$

If the series forming the diagonal converge, series (76) converges
as well, which leads to the convergence of series (74).

Thus, we can assert that if all the eigenvalues of the matrix $A$
are less than $R$ in their moduli [where $R$ is the radius of convergence
of series (73)] series (74) is convergent and even absolutely
convergent. If at least one of the eigenvalues exceeds $R$ in its absolute
value series (74) is divergent. It can be proved that the above result
is also valid for the case when the matrix $A$ has multiple eigenvalues.

Formulas (76) and (78) also imply the following formula
applicable to a matrix $A$ which can be transformed to the diagonal form
by means of a matrix $H$:

$$f(A) = H \operatorname{diag}(f(\lambda_1), f(\lambda_2), f(\lambda_3))\, H^{-1}$$
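
The two ways of computing a function of a matrix, namely summing series (74) directly and using the diagonal form, can be compared numerically. The sketch below is only an illustration for $f(x) = e^x$; the second-order matrix, its eigenvalues 3 and 1 and the matrix $H$ built from its eigenvectors are assumptions chosen for the example, and the helper names are hypothetical.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, terms=40):
    """Partial sum I + A/1! + A^2/2! + ... of series (75)."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]   # unit matrix
    term = [row[:] for row in total]                                # A^k / k!
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]
        total = [[t + v for t, v in zip(rt, rv)] for rt, rv in zip(total, term)]
    return total

# Illustrative matrix: eigenvalues 3 and 1 with eigenvectors (1, 1) and (1, -1),
# which form the columns of H.
A = [[2.0, 1.0], [1.0, 2.0]]
H = [[1.0, 1.0], [1.0, -1.0]]
H_inv = [[0.5, 0.5], [0.5, -0.5]]
D = [[math.exp(3.0), 0.0], [0.0, math.exp(1.0)]]   # diag(f(lambda_1), f(lambda_2))

by_series = exp_series(A)
by_diag = matmul(matmul(H, D), H_inv)
print(by_series)
print(by_diag)
```

Both results coincide to within rounding errors. The diagonal form requires the eigenvalues and eigenvectors to be known, whereas the direct summation needs only matrix multiplications.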


What has been proved implies that series (75) is convergent for
any matrix $A$ since the corresponding series (IV.55) has the radius
of convergence $R = \infty$. Another important example is the series

$$I + A + A^2 + \dots + A^n + \dots \tag{79}$$

which converges if all the eigenvalues of the matrix $A$ are less than
unity in their moduli (why?).

Many other properties of ordinary functions can be extended to
functions of matrices. These properties can be proved by
manipulating series in the manner used to deduce formula (64) from formula
(63). For example, the identity

$$(1 + x + x^2 + \dots)(1 - x) = \frac{1}{1 - x}\,(1 - x) = 1$$

implies the relation

$$(I + A + A^2 + \dots)(I - A) = I$$

i.e. the sum of series (79) equals $(I - A)^{-1}$ if it is convergent. At
the same time we should take into account that when proving some
properties by means of series we use a permutation of factors (for
instance, the relation $ab + ba = 2ab$) which may not be applicable
to matrices. For instance, we apply the above relation to deducing
the formula

$$e^A e^B = e^{A + B}$$

and thus it is valid for commuting matrices $A$ and $B$ and is
inapplicable when they are non-commuting.

As an example of applying the notions introduced in this section,
let us establish certain conditions guaranteeing the convergence
of the iterative method of solving a system of linear algebraic
equations. System (VI.19) can be rewritten in the vector form

$$\mathbf{x} = A\mathbf{x} + \mathbf{b} \tag{80}$$

where b is a given vector, A is a given coefficient matrix and x is
the sought-for vector. Taking an initial approximation $\mathbf{x} = \mathbf{x}_0$
we obtain the successive approximations according to the iterative
method:

$$\mathbf{x}_1 = \mathbf{b} + A\mathbf{x}_0,$$
$$\mathbf{x}_2 = \mathbf{b} + A\mathbf{x}_1 = \mathbf{b} + A(\mathbf{b} + A\mathbf{x}_0) = \mathbf{b} + A\mathbf{b} + A^2\mathbf{x}_0,$$
$$\mathbf{x}_3 = \mathbf{b} + A\mathbf{x}_2 = \mathbf{b} + A\mathbf{b} + A^2\mathbf{b} + A^3\mathbf{x}_0$$

and so on. Generally, we have

$$\mathbf{x}_n = (I + A + A^2 + \dots + A^{n-1})\,\mathbf{b} + A^n\mathbf{x}_0 \tag{81}$$

For the process to be convergent it is necessary that the initial
approximation should not affect the result obtained in the limit,
that is we must have $A^n \to 0$ as $n \to \infty$. For this to be so it is
sufficient, on the basis of formulas (77), that all the eigenvalues of
the matrix A be less than unity in their absolute values. It is the last
condition that guarantees the convergence of the iterative method. If it is
fulfilled we obtain, passing to the limit in formula (81) as $n \to \infty$,
the formula

$$\mathbf{x} = \lim_{n\to\infty} \mathbf{x}_n = (I + A + A^2 + \dots + A^n + \dots)\,\mathbf{b} = (I - A)^{-1}\mathbf{b}$$

The direct substitution of the vector x thus obtained into equation
(80) shows that the vector satisfies the equation. (Perform it!)
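
The iterative scheme for equation (80) can be tried numerically. The following is only a minimal sketch: the 2x2 matrix `A`, the vector `b` and the helper names are hypothetical choices made for illustration (the eigenvalues of this `A` are 0.75 and 0.25, so the convergence condition is fulfilled).

```python
def mat_vec(A, v):
    """Multiply a square matrix (given as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def iterate(A, b, x0, steps):
    """Apply the rule x_{k+1} = b + A x_k `steps` times, starting from x0."""
    x = list(x0)
    for _ in range(steps):
        x = [bi + yi for bi, yi in zip(b, mat_vec(A, x))]
    return x


# Hypothetical data: the eigenvalues of A are 0.75 and 0.25 (both below unity).
A = [[0.50, 0.25],
     [0.25, 0.50]]
b = [1.0, 2.0]

# Two different initial approximations lead to the same limit.
x_from_zero = iterate(A, b, [0.0, 0.0], 200)
x_from_far = iterate(A, b, [100.0, -50.0], 200)

# Residual of equation (80): x - (A x + b) should be practically zero.
residual = [xi - (yi + bi)
            for xi, yi, bi in zip(x_from_zero, mat_vec(A, x_from_zero), b)]
print(x_from_zero, x_from_far, residual)
```

For this particular data the limit can also be found by hand from $(I - A)^{-1}\mathbf{b}$, which gives the vector (16/3, 20/3).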
By analogy with vector functions of a scalar argument (see
Sec. VII.23), we can consider matrix functions of a scalar argument,
having the form B = B (x). Many of the properties of ordinary
functions can be extended to this case. For instance, we often use
the function

$$B = e^{Ax} \quad (-\infty < x < \infty,\ Ax = xA)$$

where A is a constant matrix. Applying the series techniques we
can readily prove that $(e^{Ax})' = Ae^{Ax}$. In particular, it follows that
we have $(e^{Ax}\mathbf{c})' = Ae^{Ax}\mathbf{c}$ for any constant vector c. But this means
that the vector function of x having the form

$$\mathbf{y} = e^{Ax}\mathbf{c} \tag{82}$$

is a solution of matrix equation (XV.149) with constant coefficients:

$$\mathbf{y}' = A\mathbf{y} \tag{83}$$

If an initial condition of the form $\mathbf{y}|_{x=x_0} = \mathbf{y}_0$ is given, formula (82)
implies

$$\mathbf{y}_0 = e^{Ax_0}\mathbf{c}, \quad \text{i.e.} \quad \mathbf{c} = e^{-Ax_0}\mathbf{y}_0$$

Hence we obtain the following explicit formula for the solution:

$$\mathbf{y} = e^{Ax}e^{-Ax_0}\mathbf{y}_0 = e^{A(x - x_0)}\mathbf{y}_0$$

An arbitrary initial condition being satisfied, formula (82) represents
the general solution of equation (83).
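
As an illustration of the explicit formula for the solution, the sketch below sums the exponential series for a matrix directly. The function names and the test matrix are assumptions made for this example only; for the chosen matrix the system y' = Ay reads y1' = y2, y2' = -y1, whose solution with y(0) = (1, 0) is (cos x, -sin x).

```python
import math


def mat_mul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(A, x, terms=40):
    """Partial sum I + xA + (xA)^2/2! + ... containing `terms` summands."""
    n = len(A)
    xA = [[x * a for a in row] for row in A]
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        # term becomes (xA)^k / k!
        term = [[t / k for t in row] for row in mat_mul(term, xA)]
        total = [[s + t for s, t in zip(rs, rt)]
                 for rs, rt in zip(total, term)]
    return total


def solve(A, x0, y0, x):
    """Value at x of the solution of y' = A y with y(x0) = y0."""
    E = mat_exp(A, x - x0)
    return [sum(e * y for e, y in zip(row, y0)) for row in E]


A = [[0.0, 1.0],
     [-1.0, 0.0]]          # hypothetical example matrix
y_at_one = solve(A, 0.0, [1.0, 0.0], 1.0)
print(y_at_one, (math.cos(1.0), -math.sin(1.0)))
```

Summing the series term by term is adequate only for small matrices with moderate entries; it is used here because it mirrors the definition, not because it is the preferred numerical method.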

19. Asymptotic Expansions. Asymptotic expansions, used as early as
the 18th century and put on a rigorous footing by H. Poincaré
(1854-1912), a prominent French mathematician, are widely applied
in modern mathematics. We shall consider the expansions in powers
of 1/x which are more often encountered than the expansions in x.
But of course this distinction is not essential because the
substitution $\frac{1}{x} = x_1$ transforms an expansion in 1/x (as x
approaches infinity) to an expansion in $x_1$ (as $x_1 \to 0$) and vice
versa. For example, from series (IV.55) we directly obtain

$$e^{\frac{1}{x}} = 1 + \frac{1}{1!\,x} + \frac{1}{2!\,x^2} + \frac{1}{3!\,x^3} + \dots \tag{84}$$


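Let us begin with an example. We shall investigate the behaviour
of the function

$$f(x) = e^{x^2}\int_x^\infty e^{-s^2}\,ds$$

[which is equal to $e^{x^2}\left(\frac{\sqrt{\pi}}{2} - \operatorname{Erf} x\right)$; see formulas (XIV.36’) and
(XIV.72)] for $x \to \infty$. Applying L’Hospital’s rule we can readily
show that $f(x) \sim \frac{1}{2x}$ and hence (see Secs. II.8 and III.11) we have

$$f(x) = \frac{1}{2x} + o\left(\frac{1}{x}\right) \quad (x \to \infty) \tag{85}$$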


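To specify the expansion we integrate by parts:

$$\int_x^\infty e^{-s^2}\,ds = \int_x^\infty \left(-\frac{1}{2s}\right) d\left(e^{-s^2}\right) = \left. -\frac{e^{-s^2}}{2s}\right|_x^\infty - \int_x^\infty \frac{e^{-s^2}}{2s^2}\,ds = \frac{e^{-x^2}}{2x} - \frac{1}{2}\int_x^\infty \frac{e^{-s^2}}{s^2}\,ds$$

We similarly verify that the last integral is equivalent to
$\frac{e^{-x^2}}{2x^3}$, i.e. we obtain

$$f(x) = \frac{1}{2x} - \frac{1}{2^2x^3} + o\left(\frac{1}{x^3}\right) \tag{86}$$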


This is a more accurate expansion than (85) since the term
$o\left(\frac{1}{x^3}\right)$ entering into (86) tends to zero faster than the analogous
term in (85) as $x \to \infty$. Further integrations by parts result in
still more accurate expansions

$$f(x) = \frac{1}{2x} - \frac{1}{2^2x^3} + \frac{1\cdot 3}{2^3x^5} + o\left(\frac{1}{x^5}\right) \tag{87}$$

$$f(x) = \frac{1}{2x} - \frac{1}{2^2x^3} + \frac{1\cdot 3}{2^3x^5} - \frac{1\cdot 3\cdot 5}{2^4x^7} + o\left(\frac{1}{x^7}\right) \tag{88}$$

etc. (Let the reader verify the calculations!)

One can think that these operations should result in an expansion
of the function f (x) into the series

$$\frac{1}{2x} - \frac{1}{2^2x^3} + \frac{1\cdot 3}{2^3x^5} - \frac{1\cdot 3\cdot 5}{2^4x^7} + \frac{1\cdot 3\cdot 5\cdot 7}{2^5x^9} - \dots \tag{89}$$

but D’Alembert’s test obviously indicates that this series has a zero
radius of convergence, i.e. it diverges for all x! Therefore (89) cannot
be used as an ordinary infinite series but formulas (85)-(88) show
that we can use its partial sums.

Now we proceed to give the general definition of an asymptotic
expansion. A function f (x) is said to have the asymptotic expansion

$$f(x) \sim a_0 + \frac{a_1}{x} + \frac{a_2}{x^2} + \dots + \frac{a_n}{x^n} + \dots \tag{90}$$

for $x \to \infty$ if for any n = 0, 1, 2, ... we have the representation

$$f(x) = a_0 + \frac{a_1}{x} + \dots + \frac{a_n}{x^n} + o\left(\frac{1}{x^n}\right) \quad (\text{for } x \to \infty)$$


This property automatically holds in the case of a convergent power
series of form (84). But it can also hold when series (90) is divergent
everywhere or convergent to a function distinct from f (x).

When applying series (90) we restrict ourselves to a certain number
of terms and drop all the subsequent summands. Then estimating
the last of the remaining terms we draw a conclusion as to the values
of x for which the partial sum thus chosen can be used. Asymptotic
expansions with alternating signs of their terms are particularly
convenient because in this case the expanded function lies between
any partial sum with an even number and any partial sum with
an odd number.
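
The behaviour of the partial sums of the divergent series (89) can be observed numerically. The sketch below assumes that the function of the example is $f(x) = e^{x^2}\int_x^\infty e^{-s^2}\,ds$, which can be written through the standard complementary error function as $\frac{\sqrt{\pi}}{2}e^{x^2}\operatorname{erfc}(x)$; the function names and the sample point x = 3 are choices made for illustration.

```python
import math


def f_exact(x):
    """f(x) = e^{x^2} * integral from x to infinity of e^{-s^2} ds."""
    return 0.5 * math.sqrt(math.pi) * math.exp(x * x) * math.erfc(x)


def partial_sums(x, count):
    """The first `count` partial sums of series (89) at the point x."""
    sums, total = [], 0.0
    term = 1.0 / (2.0 * x)                    # first term 1/(2x)
    for k in range(count):
        total += term
        sums.append(total)
        term *= -(2 * k + 1) / (2.0 * x * x)  # ratio of consecutive terms
    return sums


x = 3.0
exact = f_exact(x)
sums = partial_sums(x, 30)
errors = [abs(s - exact) for s in sums]
best = errors.index(min(errors))
print("smallest error", errors[best], "with", best + 1, "terms")
print("error with 30 terms", errors[-1])
```

At x = 3 the error first decreases, reaches a minimum of the order of the smallest term of the series, and then grows without bound, while the exact value stays between any two consecutive partial sums.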


§ 4. Trigonometric Series 


20. Orthogonality. Two real functions g (x) and h (x) defined over
a finite or infinite interval a < x < b are said to be orthogonal
to each other on the interval if

$$\int_a^b g(x)\,h(x)\,dx = 0 \tag{91}$$

The functions are supposed to be finite or infinite but they must have
absolutely convergent integral (91). The application of the term
“orthogonality” is accounted for by the fact that formula (91) turns
out to be in many respects analogous to the condition of the
orthogonality of two vectors given in the form of their resolutions with
respect to a Cartesian basis (see Secs. VII.10 and VII.20-21).

A system of functions is referred to as being orthogonal on an
interval if any two functions belonging to the system are orthogonal
on the interval. One of the most important orthogonal systems is
the system of trigonometric functions

$$1,\ \cos x,\ \sin x,\ \cos 2x,\ \sin 2x,\ \dots,\ \cos nx,\ \sin nx,\ \dots \tag{92}$$

which is considered on the interval −π < x < π. To prove the
orthogonality we compute the integral

$$\int_{-\pi}^{\pi} \cos nx \cos mx\,dx = \frac{1}{2}\int_{-\pi}^{\pi} [\cos (m-n)x + \cos (m+n)x]\,dx = \frac{1}{2}\left[\frac{\sin (m-n)x}{m-n} + \frac{\sin (m+n)x}{m+n}\right]_{-\pi}^{\pi} = 0 \tag{93}$$

(for m ≠ n) and, similarly, evaluate the integrals

$$\int_{-\pi}^{\pi} \sin nx \sin mx\,dx = 0 \quad (\text{for } m \ne n)$$

and

$$\int_{-\pi}^{\pi} \cos nx \sin mx\,dx = 0 \quad (\text{for any } m, n = 0, 1, 2, \dots)$$


System (92) is also orthogonal on the interval 0 < x < 2π and,
generally, on each interval of length 2π. This follows from property
10 in Sec. XIV.4 if we take the product of two functions of form (92)
as f (x) and put A = 2π.

If we apply the property of an integral of an even function (see
property 9 in Sec. XIV.4) we derive from (93) the relations

$$\int_{-\pi}^{\pi} \cos nx \cos mx\,dx = 2\int_0^{\pi} \cos nx \cos mx\,dx = 0 \quad (m, n = 0, 1, 2, \dots;\ m \ne n)$$

which means that the functions

$$1,\ \cos x,\ \cos 2x,\ \dots,\ \cos nx,\ \dots \tag{94}$$

form an orthogonal system on the interval 0 < x < π. We similarly
verify that the functions

$$\sin x,\ \sin 2x,\ \dots,\ \sin nx,\ \dots \tag{95}$$

also constitute an orthogonal system on the same interval. [Let
the reader verify that system of functions (92) is not orthogonal
on the interval 0 < x < π.]
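
The orthogonality relations can be checked numerically. The sketch below uses a composite Simpson rule (an assumption of this example, as are the helper names) and takes the first seven functions of system (92); it also shows that the same functions are not mutually orthogonal on the interval from 0 to π, where for instance the integral of 1 · sin x equals 2.

```python
import math


def integrate(func, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = func(a) + func(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * func(a + k * h)
    return s * h / 3.0


def trig_system(count):
    """Functions 1, cos x, sin x, ..., cos(count x), sin(count x)."""
    funcs = [lambda x: 1.0]
    for n in range(1, count + 1):
        funcs.append(lambda x, n=n: math.cos(n * x))
        funcs.append(lambda x, n=n: math.sin(n * x))
    return funcs


def max_offdiag(funcs, a, b):
    """Largest |integral of g*h| over all pairs of distinct functions."""
    worst = 0.0
    for i in range(len(funcs)):
        for j in range(i + 1, len(funcs)):
            val = integrate(lambda x: funcs[i](x) * funcs[j](x), a, b)
            worst = max(worst, abs(val))
    return worst


system = trig_system(3)
full = max_offdiag(system, -math.pi, math.pi)   # practically zero
half = max_offdiag(system, 0.0, math.pi)        # clearly non-zero
print(full, half)
```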
Changing the scale along the x-axis by introducing the scale
factor $\frac{l}{\pi}$ we transform functions (92) into the functions

$$1,\ \cos\frac{\pi x}{l},\ \sin\frac{\pi x}{l},\ \cos\frac{2\pi x}{l},\ \sin\frac{2\pi x}{l},\ \dots,\ \cos\frac{n\pi x}{l},\ \sin\frac{n\pi x}{l},\ \dots \tag{96}$$

which form an orthogonal system on the interval −l < x < l.
Applying this technique we can uniformly stretch the intervals on
which systems of functions (94) and (95) (and, generally, any
orthogonal system) were originally defined. We can also substitute x + h
for x where h is an arbitrary constant, i.e. shift the graphs of the
functions forming an orthogonal system along the x-axis, which
will not affect the orthogonality of the system (on the shifted
interval).

There are many orthogonal systems of functions other than the
trigonometric functions. For instance, let us construct a system of
orthogonal polynomials on the interval −1 < x < 1. Take the
system

$$1,\ x,\ x^2,\ \dots,\ x^n,\ \dots \quad (-1 \le x \le 1) \tag{97}$$

The first two functions of the system are orthogonal to each other:

$$\int_{-1}^{1} 1 \cdot x\,dx = \left.\frac{x^2}{2}\right|_{-1}^{1} = 0$$

Thus we can put $P_0(x) = 1$ and $P_1(x) = x$. But the third function
in (97) is not orthogonal to the first one (check it up!). To obtain
a third function orthogonal to the former two functions let us take
a linear combination of the first three functions of system (97):
$P_2(x) = ax^2 + bx + c$. The coefficients a, b, c must be chosen
in such a way that $P_2(x)$ be orthogonal to the polynomials $P_0(x)$
and $P_1(x)$ constructed above:

$$\int_{-1}^{1} (ax^2 + bx + c)\cdot 1\,dx = 0 \quad \text{and} \quad \int_{-1}^{1} (ax^2 + bx + c)\cdot x\,dx = 0$$

From this we find (verify the result!):

$$b = 0, \quad a = -3c, \quad P_2(x) = c\,(-3x^2 + 1)$$

The constant c is an arbitrary quantity here. It is usually so chosen
that $P_2(1) = 1$. (Such a choice of one of the equivalent objects
is called the normalization.) Thus we obtain $c = -\frac{1}{2}$ and finally
we have

$$P_2(x) = \frac{3}{2}x^2 - \frac{1}{2}$$

To construct $P_3(x)$ we take a linear combination of the first four
functions of system (97), i.e. $P_3(x) = ax^3 + bx^2 + cx + d$ and
choose the coefficients a, b, c, d in such a way that $P_3(x)$ be
orthogonal to the functions $P_0(x)$, $P_1(x)$ and $P_2(x)$ already found.
Applying this condition and introducing the additional requirement
$P_3(1) = 1$ we obtain, by analogy with the preceding calculations,

$$P_3(x) = \frac{5}{2}x^3 - \frac{3}{2}x$$

(let the reader verify the result!). Similarly, we find

$$P_4(x) = \frac{1}{8}(35x^4 - 30x^2 + 3), \quad P_5(x) = \frac{1}{8}(63x^5 - 70x^3 + 15x)$$

etc.

These polynomials are mutually orthogonal on the interval
−1 < x < 1. They were investigated by Legendre in 1783-1785
and are called now the Legendre polynomials. The polynomials
play an important role in various divisions of mathematics and
physics.

This orthogonalization process which we have applied to system
of functions (97) on the interval −1 < x < 1 can also be used for
any system of linearly independent functions on any interval if
the integrals of the squares of the functions over the interval are
convergent.
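
The orthogonalization process can be carried out mechanically. The sketch below works with exact rational arithmetic; polynomials are stored as lists of coefficients in ascending powers of x, and the function names are hypothetical. Each new monomial has its components along the already constructed polynomials subtracted, after which the result is normalized by the condition P (1) = 1.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials (coefficient lists, ascending powers)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def inner(p, q):
    """Integral of p(x) q(x) over the interval from -1 to 1."""
    total = Fraction(0)
    for k, c in enumerate(poly_mul(p, q)):
        if k % 2 == 0:                      # odd powers integrate to zero
            total += c * Fraction(2, k + 1)
    return total


def legendre(count):
    """Orthogonalize 1, x, x^2, ... and normalize so that P(1) = 1."""
    basis = []
    for n in range(count):
        p = [Fraction(0)] * n + [Fraction(1)]          # the monomial x^n
        for q in basis:
            coef = inner(p, q) / inner(q, q)
            padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [a - coef * b for a, b in zip(p, padded)]
        value_at_one = sum(p)        # P(1) equals the sum of coefficients
        basis.append([a / value_at_one for a in p])
    return basis


P = legendre(6)
for n, p in enumerate(P):
    print(n, [str(c) for c in p])
```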

21. Series in Orthogonal Functions. Let us be given a system
of functions

$$g_1(x),\ g_2(x),\ \dots,\ g_n(x),\ \dots \tag{98}$$

orthogonal on an interval a < x < b. We sometimes encounter the
problem of expanding an arbitrary function f (x) defined over the
same interval into a series in functions (98), i.e. into a series of the
form

$$f(x) = a_1g_1(x) + a_2g_2(x) + \dots + a_ng_n(x) + \dots = \sum_{n=1}^{\infty} a_ng_n(x) \tag{99}$$

where $a_n$ (n = 1, 2, 3, ...) are some numerical coefficients. This
leads to the questions whether it is possible to expand any function
f (x), how the coefficients $a_n$ can be found and in what way series
(99) converges.

For simplicity’s sake, let us consider all the functions and the
interval a < x < b to be finite. The answer to the first question is
dependent on the choice of system (98). If expansion (99) exists for
any function f (x), system of functions (98) is called complete. It can
be proved that all the orthogonal systems mentioned in Sec. 20
are complete on the corresponding intervals.

We now proceed to determine the coefficients $a_n$ of expansion (99)
under the assumption that none of the functions (98) equals zero
identically. For this purpose we multiply both sides of (99) by
$g_n(x)$ and integrate the result over the interval a < x < b:

$$\int_a^b f(x)\,g_n(x)\,dx = a_1\int_a^b g_1(x)\,g_n(x)\,dx + a_2\int_a^b g_2(x)\,g_n(x)\,dx + \dots + a_n\int_a^b g_n^2(x)\,dx + \dots$$

By the orthogonality of system (98), all the integrals on the
right-hand side of the last relation are equal to zero except the integral
of $g_n^2(x)$, and hence we deduce the formula for the coefficients:

$$a_n = \frac{\int_a^b f(x)\,g_n(x)\,dx}{\int_a^b g_n^2(x)\,dx} \quad (n = 1, 2, 3, \dots) \tag{100}$$

The coefficients being uniquely defined, we conclude, in particular,
that if the sums of two series of form (99) are identically equal to each
other the coefficients in the same functions $g_n(x)$ in the series are also
equal and that if the sum of series (99) is identically equal to zero
all the coefficients are also equal to zero.

22. Fourier Series. The above general results can be applied to
concrete orthogonal systems of functions. For instance, taking
system (92) we conclude that any finite function defined in the
interval −π < x < π can be expanded in a series of the form

$$f(x) = a_1 + a_2\cos x + a_3\sin x + a_4\cos 2x + a_5\sin 2x + \dots$$

It is convenient to change the notation of the coefficients and to
put down the series as

$$f(x) = a_0 + a_1\cos x + b_1\sin x + a_2\cos 2x + b_2\sin 2x + \dots = a_0 + \sum_{n=1}^{\infty} (a_n\cos nx + b_n\sin nx) \tag{101}$$

The coefficients of the series are found by formula (100):

$$a_0 = \frac{\int_{-\pi}^{\pi} f(x)\cdot 1\,dx}{\int_{-\pi}^{\pi} 1^2\,dx} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx,$$

$$a_n = \frac{\int_{-\pi}^{\pi} f(x)\cos nx\,dx}{\int_{-\pi}^{\pi} \cos^2 nx\,dx} = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx, \tag{102}$$

$$b_n = \frac{\int_{-\pi}^{\pi} f(x)\sin nx\,dx}{\int_{-\pi}^{\pi} \sin^2 nx\,dx} = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx \quad (n \ge 1)$$

The series with respect to systems of functions (94) or (95) are
investigated in a similar way:

$$f(x) = a_0 + \sum_{n=1}^{\infty} a_n\cos nx \quad (0 \le x \le \pi) \tag{103}$$

$$a_0 = \frac{1}{\pi}\int_0^{\pi} f(x)\,dx, \quad a_n = \frac{2}{\pi}\int_0^{\pi} f(x)\cos nx\,dx \quad (n \ge 1)$$

and

$$f(x) = \sum_{n=1}^{\infty} b_n\sin nx \quad (0 \le x \le \pi) \tag{104}$$

$$b_n = \frac{2}{\pi}\int_0^{\pi} f(x)\sin nx\,dx$$

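We also often use series in functions (96) and in functions
obtained from (94) or (95) by changing the scale along the x-axis:

$$f(x) = a_0 + \sum_{n=1}^{\infty} \left(a_n\cos\frac{n\pi x}{l} + b_n\sin\frac{n\pi x}{l}\right) \quad (-l \le x \le l) \tag{105}$$

$$a_0 = \frac{1}{2l}\int_{-l}^{l} f(x)\,dx, \quad a_n = \frac{1}{l}\int_{-l}^{l} f(x)\cos\frac{n\pi x}{l}\,dx, \quad b_n = \frac{1}{l}\int_{-l}^{l} f(x)\sin\frac{n\pi x}{l}\,dx \quad (n \ge 1),$$

$$f(x) = a_0 + \sum_{n=1}^{\infty} a_n\cos\frac{n\pi x}{l} \quad (0 \le x \le l) \tag{106}$$

$$a_0 = \frac{1}{l}\int_0^{l} f(x)\,dx, \quad a_n = \frac{2}{l}\int_0^{l} f(x)\cos\frac{n\pi x}{l}\,dx \quad (n \ge 1)$$

and

$$f(x) = \sum_{n=1}^{\infty} b_n\sin\frac{n\pi x}{l} \quad (0 \le x \le l) \tag{107}$$

$$b_n = \frac{2}{l}\int_0^{l} f(x)\sin\frac{n\pi x}{l}\,dx \quad (n \ge 1)$$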


Series (101), (103) and (104) are special cases of series (105), (106)
and (107) because the former can be obtained from the latter by
putting l = π. They are all called Fourier series after the prominent
French mathematician J. Fourier (1768-1830) who for the first time
applied them in his investigations in the theory of heat conductivity
although such series had been used before. Formulas (102) for the
Fourier coefficients and some other similar formulas were obtained
as early as 1759 by Clairaut and in 1777 by Euler.

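Consider some examples of expansions in Fourier series. Let it
be necessary to expand the function y = x in the interval 0 ≤ x ≤ l
in series (106). For this purpose we compute the coefficients:

$$a_0 = \frac{1}{l}\int_0^{l} x\,dx = \frac{l}{2}$$

and

$$a_n = \frac{2}{l}\int_0^{l} x\cos\frac{n\pi x}{l}\,dx = \frac{2}{l}\left. x\,\frac{l}{n\pi}\sin\frac{n\pi x}{l}\right|_0^{l} - \frac{2}{l}\,\frac{l}{n\pi}\int_0^{l} \sin\frac{n\pi x}{l}\,dx = \frac{2}{n\pi}\,\frac{l}{n\pi}\left.\cos\frac{n\pi x}{l}\right|_0^{l} = \frac{2l}{n^2\pi^2}(\cos n\pi - 1) = \frac{2l}{n^2\pi^2}\left[(-1)^n - 1\right] \quad (n \ge 1)$$

This yields

$$a_2 = a_4 = \dots = 0, \quad a_1 = -\frac{4l}{\pi^2\cdot 1^2}, \quad a_3 = -\frac{4l}{\pi^2\cdot 3^2}, \quad a_5 = -\frac{4l}{\pi^2\cdot 5^2}, \ \dots$$

and finally we obtain

$$x = l\left[\frac{1}{2} - \frac{4}{\pi^2}\left(\frac{1}{1^2}\cos\frac{\pi x}{l} + \frac{1}{3^2}\cos\frac{3\pi x}{l} + \frac{1}{5^2}\cos\frac{5\pi x}{l} + \dots\right)\right] \quad (0 \le x \le l) \tag{108}$$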
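
Expansion (108) is easy to test numerically. The sketch below sums a finite number of odd harmonics; the function name, the value l = 2 and the number of terms are arbitrary choices made for illustration. Since the coefficients decrease like 1/n², a thousand harmonics already reproduce x to about three decimal places over the whole interval, including its end-points.

```python
import math


def cosine_series_x(x, l, terms):
    """Partial sum of expansion (108) with the first `terms` odd harmonics."""
    s = 0.0
    for j in range(terms):
        n = 2 * j + 1
        s += math.cos(n * math.pi * x / l) / n ** 2
    return l * (0.5 - 4.0 / math.pi ** 2 * s)


l = 2.0
for x in (0.0, 0.3, 1.0, 1.7, 2.0):
    print(x, cosine_series_x(x, l, 1000))
```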
As another example let us take a function which is defined by
means of several formulas. Namely, let

$$f(x) = \begin{cases} 1 & \text{for } -l < x < -l + \alpha \\ 0 & \text{for } -l + \alpha < x < 0 \\ 1 & \text{for } 0 < x < \alpha \\ 0 & \text{for } \alpha < x < l \end{cases}$$

where α is an arbitrary constant number belonging to the interval
0 < α < l. The graph of the function is represented in Fig. 340b
by the part corresponding to the interval −l < x < l.

Fig. 340 (a), (b)

Let us expand the function in series (105):


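$$a_0 = \frac{1}{2l}\int_{-l}^{l} f(x)\,dx = \frac{1}{2l}\left(\int_{-l}^{-l+\alpha} f(x)\,dx + \int_{-l+\alpha}^{0} f(x)\,dx + \int_0^{\alpha} f(x)\,dx + \int_{\alpha}^{l} f(x)\,dx\right) = \frac{1}{2l}\left(\int_{-l}^{-l+\alpha} 1\,dx + \int_{-l+\alpha}^{0} 0\,dx + \int_0^{\alpha} 1\,dx + \int_{\alpha}^{l} 0\,dx\right) = \frac{1}{2l}(\alpha + 0 + \alpha + 0) = \frac{\alpha}{l},$$

$$a_n = \frac{1}{l}\int_{-l}^{l} f(x)\cos\frac{n\pi x}{l}\,dx = \frac{1}{l}\left(\int_{-l}^{-l+\alpha} \cos\frac{n\pi x}{l}\,dx + \int_0^{\alpha} \cos\frac{n\pi x}{l}\,dx\right) = \frac{1}{n\pi}\left.\sin\frac{n\pi x}{l}\right|_{-l}^{-l+\alpha} + \frac{1}{n\pi}\left.\sin\frac{n\pi x}{l}\right|_0^{\alpha} =$$
$$= \frac{1}{n\pi}\left(-\sin n\pi\cos\frac{n\pi\alpha}{l} + \cos n\pi\sin\frac{n\pi\alpha}{l} + \sin\frac{n\pi\alpha}{l}\right) = \frac{1}{n\pi}\left[(-1)^n + 1\right]\sin\frac{n\pi\alpha}{l}$$

and

$$b_n = \frac{1}{l}\left(\int_{-l}^{-l+\alpha} \sin\frac{n\pi x}{l}\,dx + \int_0^{\alpha} \sin\frac{n\pi x}{l}\,dx\right) = -\frac{1}{n\pi}\left[\cos\frac{n\pi(-l+\alpha)}{l} - \cos(-n\pi) + \cos\frac{n\pi\alpha}{l} - 1\right] =$$
$$= -\frac{1}{n\pi}\left[\cos n\pi\cos\frac{n\pi\alpha}{l} + \sin n\pi\sin\frac{n\pi\alpha}{l} - (-1)^n + \cos\frac{n\pi\alpha}{l} - 1\right] = -\frac{1}{n\pi}\left\{\left[(-1)^n + 1\right]\cos\frac{n\pi\alpha}{l} - \left[(-1)^n + 1\right]\right\} =$$
$$= \frac{1}{n\pi}\left[(-1)^n + 1\right]\left(1 - \cos\frac{n\pi\alpha}{l}\right) \quad (n \ge 1)$$

(Here we have applied the technique which is used for computing
an integral of any function represented by several formulas.) Thus,
we have $a_n = b_n = 0$ for all odd n and

$$a_{2k} = \frac{1}{k\pi}\sin\frac{2k\pi\alpha}{l}, \quad b_{2k} = \frac{1}{k\pi}\left(1 - \cos\frac{2k\pi\alpha}{l}\right) \quad (k = 1, 2, 3, \dots)$$

for even numbers n = 2k (k = 1, 2, 3, ...).

The result becomes particularly simple when $\alpha = \frac{l}{2}$ because
in this case

$$a_{2k} = \frac{1}{k\pi}\sin k\pi = 0, \quad b_{2k} = \frac{1}{k\pi}(1 - \cos k\pi) = \frac{1}{k\pi}\left[1 - (-1)^k\right] \quad (k = 1, 2, 3, \dots)$$

and thus the series takes the form

$$f(x) = \frac{1}{2} + \frac{2}{\pi}\left(\frac{1}{1}\sin\frac{2\pi x}{l} + \frac{1}{3}\sin\frac{6\pi x}{l} + \frac{1}{5}\sin\frac{10\pi x}{l} + \dots\right) \tag{109}$$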

Consequently, a function represented by several formulas can be
expanded in a unique series. The discovery of this fact by Fourier
was a remarkable event which led to a considerable extension of
the notion of a function.

In applying the theory to practical expansions in Fourier series
we usually apply formulas of numerical integration (see Sec. XIV.13)
which are particularly important when the function in question is
represented by a table or a graph. For instance, suppose that we
consider an expansion in series (107) and want to utilize the trapezoid
rule by dividing the interval of integration into 24 parts. Then,
introducing the notation $x_k = \frac{kl}{24}$, $f_k = f(x_k)$ (k = 0, 1, ..., 24)
we obtain

$$b_n = \frac{2}{l}\int_0^{l} f(x)\sin\frac{n\pi x}{l}\,dx \approx \frac{2}{l}\,\frac{l}{24}\left(\frac{f_0}{2}\sin\frac{n\pi x_0}{l} + f_1\sin\frac{n\pi x_1}{l} + \dots + \frac{f_{24}}{2}\sin\frac{n\pi x_{24}}{l}\right) =$$
$$= \frac{1}{12}\left(\frac{f_0}{2}\sin 0^\circ n + f_1\sin 7.5^\circ n + f_2\sin 15^\circ n + \dots + \frac{f_{24}}{2}\sin 180^\circ n\right) \tag{110}$$


We see that for any n it is only the following values of the sine that
are needed

sin 0°    = 0.0000,    sin 52.5° = 0.7934,
sin 7.5°  = 0.1305,    sin 60°   = 0.8660,
sin 15°   = 0.2588,    sin 67.5° = 0.9239,
sin 22.5° = 0.3827,    sin 75°   = 0.9659,
sin 30°   = 0.5000,    sin 82.5° = 0.9914,
sin 37.5° = 0.6088,    sin 90°   = 1.0000
sin 45°   = 0.7071,


When applying formula (110) for a given n we must substitute the 
corresponding values of the sine taken from this table by applying 
the reduction formulas of trigonometry, group together the terms 
having the same second factor, sum up the values of f, in these 
groups and then, after the multiplication has been performed, com- 
pute the whole sum. 
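
Formula (110) translates directly into a few lines of code. In the sketch below the sines are computed by the library rather than taken from the table, and the function name and the sample data are hypothetical. The data are the 25 equally spaced values of a function that is itself a combination of two sines, so the expected coefficients are known in advance.

```python
import math


def sine_coefficient(samples, n):
    """Approximate b_n by formula (110); `samples` holds f_0, ..., f_24."""
    if len(samples) != 25:
        raise ValueError("formula (110) needs the 25 values f_0, ..., f_24")
    total = 0.0
    for k, fk in enumerate(samples):
        weight = 0.5 if k in (0, 24) else 1.0   # halved end-point values
        total += weight * fk * math.sin(math.radians(7.5 * k * n))
    return total / 12.0


# Hypothetical tabulated function: f(x) = 3 sin(pi x/l) + 0.5 sin(2 pi x/l).
samples = [3.0 * math.sin(math.pi * k / 24) + 0.5 * math.sin(2 * math.pi * k / 24)
           for k in range(25)]
b1 = sine_coefficient(samples, 1)
b2 = sine_coefficient(samples, 2)
b3 = sine_coefficient(samples, 3)
print(b1, b2, b3)
```

For data of this special kind the trapezoid rule returns the coefficients 3, 0.5 and 0 up to rounding; for a general tabulated function the result is only an approximation whose quality decreases as n grows.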

23. Expanding a Periodic Function. Fourier series are used not
only for expanding a function defined on a finite interval but also
for functions defined over the whole axis. We first suppose that
a function f (x) is defined in the interval −π < x < π. Let us
expand it in series (101). The terms of the series are defined not
only inside the interval but also outside it and their period is equal
to 2π (see Sec. I.16) because

$$\cos n(x + 2\pi) = \cos (nx + 2n\pi) = \cos nx$$

$$\sin n(x + 2\pi) = \sin (nx + 2n\pi) = \sin nx \quad (n = 1, 2, 3, \dots)$$

Hence, the sum can also be continued on the whole x-axis and
is a periodic function of period 2π. But the sum is equal to f (x)
on the interval −π < x < π and consequently series (101)
extends the function f (x) from the interval −π < x < π onto the
whole x-axis with period 2π.

Similarly, all the terms of series (103) being even functions, its
sum results from the extension of the function f (x) as an even
function from the interval 0 < x < π onto the whole x-axis with
period 2π; similarly, the sum of series (104) is the continuation
of f (x) as an odd function with the same period. An analogous result
is obtained when we take series (105)-(107) but the period is
naturally equal to 2l.

Fig. 340 represents the graphs of the sums of the series considered
in the examples of Sec. 22 which are regarded as being extended
onto the whole x-axis. It should be noted that although 2l is a period
for the second example it is not the least period, which equals l
in this case.

Now let a function f (x) be originally defined over the whole
x-axis as a periodic function of period 2π. If we take a series of form
(101) in which the coefficients are computed by formulas (102) then,
as we have shown, its sum will yield the periodic continuation of
f (x) from the interval −π < x < π onto the whole x-axis with
period 2π and thus it will coincide with f (x) on the whole axis.
The coefficients can also be found by formulas

$$a_0 = \frac{1}{2\pi}\int_{\alpha-\pi}^{\alpha+\pi} f(x)\,dx, \quad a_n = \frac{1}{\pi}\int_{\alpha-\pi}^{\alpha+\pi} f(x)\cos nx\,dx, \quad b_n = \frac{1}{\pi}\int_{\alpha-\pi}^{\alpha+\pi} f(x)\sin nx\,dx \quad (n = 1, 2, \dots)$$

where α is an arbitrary number. In particular, we can take the formula
$a_0 = \frac{1}{2\pi}\int_0^{2\pi} f(x)\,dx$ and similar formulas for the coefficients
$a_n$ and $b_n$. This is implied by the periodicity of the integrands
(see Sec. XIV.4, property 10).

Similarly, an even (odd) function with period 2π can be expanded
in series (103) [series (104)]. For a function of period 2l we obtain
series (105)-(107).

Expansion (105) is often transformed by means of formula (I.18).
This results in

$$f(x) = a_0 + \sum_{n=1}^{\infty} M_n\sin\left(\frac{n\pi x}{l} + \alpha_n\right) \tag{111}$$

The constant $a_0$ is equal to the mean value of the function f (x)
on the whole x-axis (see Sec. XIV.5) since the means of the other
summands are equal to zero (check it up!). The first variable
summand in (111) is called the fundamental harmonic; it has the least
period 2l. The subsequent summands are called higher (upper)
harmonics. Their least periods are equal, in succession, to
$\frac{2l}{2}$, $\frac{2l}{3}$, $\frac{2l}{4}$ and so on. Therefore an expansion of a periodic
function in a Fourier series is referred to as the harmonic analysis.




If the independent variable is interpreted as time it is advisable
to denote the period by T and to rewrite formula (111) in the form

$$f(t) = a_0 + \sum_{n=1}^{\infty} M_n\sin (n\omega t + \alpha_n) \quad \left(\omega = \frac{2\pi}{T}\right)$$

Thus, a Fourier series represents an arbitrary periodic oscillatory
motion in the form of a sum of harmonic oscillations with multiple
frequencies. Such expansions are well known in acoustics where the
fundamental harmonic $M_1\sin (\omega t + \alpha_1)$ determines the fundamental
tone and the subsequent harmonics are the overtones which determine
the tone colour.

The periods of the summands in expansion (112) are commensu- 
rable with one another, i.e. their ratios are rational numbers. This 
is related to the general property that the sum of periodic functions 
with different periods is periodic if and only if the periods of the 
summands are commensurable. A sum of periodic functions with 
incommensurable periods belongs to a wider class of the so-called 
almost periodic functions which have many applications. In parti- 
cular, they are used for investigating superpositions of non-syn- 
chronous vibrations. 

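24. Example. Bessel’s Functions as Fourier Coefficients. A function
of the form

$$e^{ix\cos t} = \cos (x\cos t) + i\sin (x\cos t) \tag{112}$$

plays an important role in radioengineering. It is an even periodic
function with period 2π with respect to t for any fixed x. Therefore
it can be expanded in a Fourier series of form (103). To obtain the
expansion we must multiply the series

$$e^{\frac{ix}{2}e^{it}} = 1 + \frac{1}{1!}\left(\frac{ix}{2}\right)e^{it} + \frac{1}{2!}\left(\frac{ix}{2}\right)^2 e^{2it} + \frac{1}{3!}\left(\frac{ix}{2}\right)^3 e^{3it} + \dots$$

by the series

$$e^{\frac{ix}{2}e^{-it}} = 1 + \frac{1}{1!}\left(\frac{ix}{2}\right)e^{-it} + \frac{1}{2!}\left(\frac{ix}{2}\right)^2 e^{-2it} + \frac{1}{3!}\left(\frac{ix}{2}\right)^3 e^{-3it} + \dots$$

After the multiplication has been performed we obtain periodic
function (112) on the left-hand side. Let us combine the terms on
the right-hand side which contain the same exponential functions.
To do this we note that we have the coefficient

$$\frac{1}{k!}\left(\frac{ix}{2}\right)^k + \frac{1}{(k+1)!}\,\frac{1}{1!}\left(\frac{ix}{2}\right)^{k+2} + \frac{1}{(k+2)!}\,\frac{1}{2!}\left(\frac{ix}{2}\right)^{k+4} + \dots = i^k\left[\frac{1}{k!}\left(\frac{x}{2}\right)^k - \frac{1}{1!\,(k+1)!}\left(\frac{x}{2}\right)^{k+2} + \frac{1}{2!\,(k+2)!}\left(\frac{x}{2}\right)^{k+4} - \dots\right] = i^kJ_k(x)$$

in $e^{ikt}$ and $e^{-ikt}$ (see Sec. XV.26). Thus, we obtain

$$e^{ix\cos t} = J_0(x) + \sum_{k=1}^{\infty} i^kJ_k(x)\left(e^{ikt} + e^{-ikt}\right) = J_0(x) + 2\sum_{k=1}^{\infty} i^kJ_k(x)\cos kt \tag{113}$$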


It is Fourier expansion (113) that we need. Separating the real
and imaginary parts in (113) we arrive at the expansions

$$\cos (x\cos t) = J_0(x) - 2J_2(x)\cos 2t + 2J_4(x)\cos 4t - \dots$$

and

$$\sin (x\cos t) = 2J_1(x)\cos t - 2J_3(x)\cos 3t + 2J_5(x)\cos 5t - \dots$$

Formula (113) implies, in particular, the integral representation
of a Bessel function of an integral order:

$$J_n(x) = \frac{i^{-n}}{\pi}\int_0^{\pi} e^{ix\cos t}\cos nt\,dt$$

[check up the result taking advantage of formula (103) for the
coefficients of series (103)].

25. Speed of Convergence of a Fourier Series. Let a bounded
periodic function f (x) of period 2l be expanded in Fourier series (105).
We can easily show that all the Fourier coefficients are bounded
above in their absolute values by the same positive constant:

$$|a_n| = \left|\frac{1}{l}\int_{-l}^{l} f(x)\cos\frac{n\pi x}{l}\,dx\right| \le \frac{1}{l}\int_{-l}^{l} |f(x)|\left|\cos\frac{n\pi x}{l}\right| dx \le \frac{1}{l}\int_{-l}^{l} |f(x)|\,dx$$

The same result is obtained if the function f (x) is unbounded but
absolutely integrable (summable) on the interval −l < x < l,
i.e. if

$$\int_{-l}^{l} |f(x)|\,dx < \infty$$

Let the function f (x) be bounded and discontinuous. Then its
Fourier coefficients $a_n$ and $b_n$ are of the order of $\frac{1}{n}$ as $n \to \infty$ (in
particular, this is the case in the second example considered in
Sec. 22).

Actually, suppose, for definiteness, that f (x) has two
discontinuities in the interval −l < x < l at the points $x = x_1$ and $x = x_2$
and that $-l < x_1 < x_2 < l$. Then we have

$$a_n = \frac{1}{l}\int_{-l}^{l} f(x)\cos\frac{n\pi x}{l}\,dx = \frac{1}{l}\int_{-l}^{x_1} f(x)\cos\frac{n\pi x}{l}\,dx + \frac{1}{l}\int_{x_1}^{x_2} f(x)\cos\frac{n\pi x}{l}\,dx + \frac{1}{l}\int_{x_2}^{l} f(x)\cos\frac{n\pi x}{l}\,dx$$

Let us integrate by parts each of these integrals:

$$a_n = \frac{1}{n\pi}\left[f(l)\sin\frac{n\pi l}{l} - f(-l)\sin\frac{n\pi (-l)}{l}\right] - \frac{1}{n\pi}\left[f(x_1 + 0) - f(x_1 - 0)\right]\sin\frac{n\pi x_1}{l} - \frac{1}{n\pi}\left[f(x_2 + 0) - f(x_2 - 0)\right]\sin\frac{n\pi x_2}{l} - \frac{1}{n\pi}\int_{-l}^{l} f'(x)\sin\frac{n\pi x}{l}\,dx \tag{114}$$

The last summand on the right-hand side differs from the Fourier
coefficient of f’ (x) only in the constant factor in front of the integral.
But $\int_{-l}^{l} |f'(x)|\,dx < \infty$ because the derivative f’ (x) retains its sign
on each interval (α, β) of monotonicity and continuity of the
function f (x) and hence

$$\int_{\alpha}^{\beta} |f'(x)|\,dx = \left|\int_{\alpha}^{\beta} f'(x)\,dx\right| = |f(\beta - 0) - f(\alpha + 0)| < \infty$$

We have excluded from our considerations functions having an
infinite number of intervals of monotonicity within a finite interval
of variation of x (see Fig. 102) because they are rarely encountered.
We see that under the assumptions concerning f (x) the last integral
in formula (114) is bounded which implies what has been said about
the order of smallness of the Fourier coefficients.

Now let the function f (x) itself be continuous. Suppose that its
derivative has discontinuities and is bounded (this is the case in
the first example considered in Sec. 22). Then its Fourier coefficients
are of the order of $\frac{1}{n^2}$ as $n \to \infty$.

Indeed, taking the expression of a Fourier coefficient and
integrating by parts we obtain

$$a_n = \frac{1}{l}\int_{-l}^{l} f(x)\cos\frac{n\pi x}{l}\,dx = -\frac{1}{n\pi}\int_{-l}^{l} f'(x)\sin\frac{n\pi x}{l}\,dx$$

Applying the argument of the preceding paragraphs to the integral
on the right-hand side we conclude that it is of the order of
$\frac{1}{n}$ as $n \to \infty$, and hence $a_n$ is of the order of $\frac{1}{n^2}$ (the same result is
similarly obtained for $b_n$). If f (x) and f’ (x) are continuous and
f” (x) has discontinuities we can perform the integration by parts
twice and thus prove that in this case the Fourier coefficients are
of the order of $\frac{1}{n^3}$ and so on. Hence, the order of smallness of
Fourier coefficients depends on the “smoothness” of the function in
question, i.e. on the number of continuous derivatives it possesses.
The greater the number, the higher the order of smallness of
the Fourier coefficients, that is the higher the speed of convergence
of the Fourier series of the function.

The Fourier series of a discontinuous function f (x) converges
very slowly and therefore it is difficult to apply it to practical
calculations. To overcome the difficulty we sometimes try to
construct a function φ (x) having discontinuities at the same points
as f (x) and the same jumps (see Fig. 344). It is advisable to choose
a function φ (x) whose structure is as simple as possible. After
φ (x) has been chosen we form the difference f (x) − φ (x) which
no longer has discontinuities and therefore is expanded in a Fourier
series whose speed of convergence is higher than that of the Fourier
series of f (x). Consequently, we represent f (x) as the sum of a simple
function φ (x) and a Fourier series with a higher speed of
convergence. We can similarly eliminate the discontinuities of the first
derivative and so on (compare this with the methods used in Sec. 4).

Fig. 344

Thus, if a function f (x) is continuous and has a bounded
derivative the order of smallness of its Fourier coefficients (as $n \to \infty$)
is not less than that of $\frac{1}{n^2}$. But according to Sec. 4

$$\sum_{n=1}^{\infty} \frac{1}{n^2} < \infty$$

and hence, by Weierstrass’ test (see Sec. 8), the Fourier series
uniformly converges on the whole x-axis. A more extensive investigation
shows that the condition which we have imposed on f’ (x) is
unnecessary because Weierstrass’ test can be applied under some more
general assumptions. In the case of a continuous function the
substitution of any numerical value of x into the series exactly yields
the corresponding value f (x).

If a function f (x) is discontinuous its Fourier series by no means
converges uniformly because its terms are continuous functions
(see property 4 in Sec. 9). It can be shown that in this case the
substitution of a numerical value of x into the series results in f (x)
at all the points of continuity of f (x) and in

$$\frac{f(x - 0) + f(x + 0)}{2}$$

at the points of discontinuity. For instance, series (109) has the
sum equal to $\frac{1}{2}$ for x = 0 since in this case f (−0) = 0 and f (+0) = 1.

If a function defined in a finite interval is expanded in a Fourier
series (see Sec. 22) the speed of convergence of the series is specified
by the discontinuities of the function and of its derivatives which
occur after the function has been periodically extended onto the
whole x-axis as it was described in Sec. 23. For instance, the Fourier
coefficients in the first example considered in Sec. 22 are of the
order of $\frac{1}{n^2}$ since the extension (see Fig. 340a) results in a
function whose first derivative has discontinuities. If the same function
is expanded into a series in sines [series (107)] the coefficients are
of the order of $\frac{1}{n}$ because after the extension has been performed
the function itself has discontinuities (why?).

Fourier series enable us to find the sums of many interesting
numerical series. For instance, if we substitute $x = \frac{l}{4}$ into series
(109) we obtain

$$1 = \frac{1}{2} + \frac{2}{\pi}\left(1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots\right)$$

which implies

$$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots = \frac{\pi}{4}$$

If we substitute x = 0 into series (108) we get

$$0 = l\left[\frac{1}{2} - \frac{4}{\pi^2}\left(\frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \dots\right)\right], \quad \text{i.e.} \quad \frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \dots = \frac{\pi^2}{8}$$

Further, the last result makes it possible to find the value ζ (2)
of the zeta function [see formula (19)]:

$$\zeta(2) = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \dots = \left(\frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \dots\right) + \frac{1}{2^2}\left(\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots\right) = \frac{\pi^2}{8} + \frac{1}{4}\zeta(2)$$

which results in

$$\zeta(2) = \frac{\pi^2}{6} = 1.645$$
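
The three numerical sums obtained above can be compared with their partial sums. The sketch below is only an illustration with hypothetical function names; note how slowly the partial sums approach their limits, in agreement with the slow decrease of the terms.

```python
import math


def leibniz(terms):
    """Partial sum of 1 - 1/3 + 1/5 - 1/7 + ..."""
    return sum((-1) ** k / (2 * k + 1) for k in range(terms))


def odd_inverse_squares(terms):
    """Partial sum of 1/1^2 + 1/3^2 + 1/5^2 + ..."""
    return sum(1.0 / (2 * k + 1) ** 2 for k in range(terms))


def zeta2(terms):
    """Partial sum of 1/1^2 + 1/2^2 + 1/3^2 + ..."""
    return sum(1.0 / n ** 2 for n in range(1, terms + 1))


N = 100000
print(leibniz(N), math.pi / 4)
print(odd_inverse_squares(N), math.pi ** 2 / 8)
print(zeta2(N), math.pi ** 2 / 6)
```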


26. Fourier Series in Complex Form. Applying Euler’s formulas 
we can pass from a Fourier series containing trigonometric functions 
to a series in exponential functions which is sometimes preferable. 
To perform such a transformation of series (105) we can take the 
formulas 


imnx __ innx 
if l 
AnI e +e 
co = 
AT 2 
and 
imnx __ianx 
l l 
SORNE e =e 
sın na 
l 2i 


After the substitution is performed we combine similar terms on
the right-hand side and thus obtain a series in which the summation
is extended over all the exponential functions of the form $e^{in\pi x/l}$
($n = 0, \pm 1, \pm 2, \dots$). Thus we obtain

$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{in\pi x/l} \tag{115}$$

where $c_n$ ($n = 0, \pm 1, \pm 2, \dots$) are some complex coefficients.*

* A series of this type in which the summation index runs from $-\infty$ to $\infty$
is sometimes referred to as a two-way series. - Tr.

To find the coefficients $c_n$ we multiply both sides by $e^{-in\pi x/l}$,
for a fixed $n$, and integrate the result from $-l$ to $l$. The integrals
of the terms with numbers different from $n$ are equal to zero:

$$\int_{-l}^{l} c_m e^{im\pi x/l} e^{-in\pi x/l}\,dx = c_m \frac{l}{i\pi(m-n)}\, e^{i\pi(m-n)x/l}\Big|_{x=-l}^{l} = \frac{c_m l}{i\pi(m-n)}\left\{e^{i\pi(m-n)} - e^{-i\pi(m-n)}\right\} = \frac{2 c_m l}{\pi(m-n)} \sin \pi(m-n) = 0 \quad (m \ne n)$$

The integration of the term having the number $n$ results in $2lc_n$.
Thus, the coefficients in series (115) are computed by the formula

$$c_n = \frac{1}{2l}\int_{-l}^{l} f(x)\, e^{-in\pi x/l}\,dx \quad (n = 0, \pm 1, \pm 2, \dots) \tag{116}$$
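
Formula (116) lends itself to a direct numerical check. The sketch below is an illustration under assumptions: the test function f(x) = x on (−l, l), the value l = 2 and the name `complex_coeff` are choices made for this example only. For this particular f an integration by parts gives c_n = i l (−1)^n / (nπ) for n ≠ 0, which the midpoint rule reproduces approximately.

```python
import cmath
import math

def complex_coeff(f, n, l, steps=20000):
    """Approximate c_n = (1/2l) * integral_{-l}^{l} f(x) exp(-i n pi x / l) dx."""
    h = 2.0 * l / steps
    total = 0j
    for i in range(steps):
        x = -l + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += f(x) * cmath.exp(-1j * n * math.pi * x / l)
    return total * h / (2.0 * l)

if __name__ == "__main__":
    l = 2.0
    for n in (1, 2, 3):
        exact = 1j * l * (-1) ** n / (n * math.pi)   # closed form for f(x) = x
        print(n, complex_coeff(lambda x: x, n, l), exact)
```

Since f is real, the computed values also satisfy c_{−n} = c_n* up to rounding.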

Let us discuss the general case of an expansion with respect to
an orthogonal system of complex functions. Two complex functions
$g(x)$ and $h(x)$ are said to be orthogonal to each other on the interval
$a \le x \le b$ if

$$\int_a^b g(x)\, h^*(x)\,dx = 0 \tag{117}$$

where the asterisk designates the complex conjugate function (see
Sec. VIII.3). In the special case when $g$ and $h$ are real this definition
coincides with the former definition (91). It should be noted that if
we pass from (117) to the complex conjugate expression in both
sides we obtain

$$\left(\int_a^b g(x)\, h^*(x)\,dx\right)^* = \int_a^b g^*(x)\, h(x)\,dx = \int_a^b h(x)\, g^*(x)\,dx = 0$$

which means that the orthogonality condition is independent of
the order in which we enumerate the functions, that is if $g$ is
orthogonal to $h$ it follows that $h$ is orthogonal to $g$, and hence we can
speak about the mutual orthogonality of the functions.

If we have a system of complex functions

$$g_1(x),\ g_2(x),\ \dots,\ g_n(x),\ \dots \tag{118}$$

which are orthogonal on an interval $a \le x \le b$ and if a complex
function $f(x)$ can be expanded into a series in these functions [this
is always the case when system (118) is complete] we obtain

$$f(x) = c_1 g_1(x) + c_2 g_2(x) + \dots + c_n g_n(x) + \dots = \sum_{n=1}^{\infty} c_n g_n(x) \tag{119}$$
To determine the coefficients we multiply both sides by $g_n^*(x)$ for
a fixed $n$ and integrate the result from $a$ to $b$. By the orthogonality
condition, only one term on the right-hand side will be different
from zero and hence we obtain the formula

$$c_n = \frac{\int_a^b f(x)\, g_n^*(x)\,dx}{\int_a^b g_n(x)\, g_n^*(x)\,dx} = \frac{\int_a^b f(x)\, g_n^*(x)\,dx}{\int_a^b |g_n(x)|^2\,dx}$$

Expansion (115) is a particular case of (119) when the system
of functions

$$\dots,\ e^{-i2\pi x/l},\ e^{-i\pi x/l},\ 1,\ e^{i\pi x/l},\ e^{i2\pi x/l},\ \dots$$

(which is complete on the interval $-l \le x \le l$) is taken as system
(118). The expansion is valid for any bounded function and even
for an unbounded complex function $f(x)$ which is absolutely
integrable over the interval (see Sec. XIV.16).

27. Parseval Relation. Let us return to real functions. Take an
orthogonal and complete system of functions on an interval
$a \le x \le b$:

$$g_1(x),\ g_2(x),\ \dots,\ g_n(x),\ \dots \tag{120}$$

Let us consider the expansion of an arbitrary function $f(x)$ into
a series in these functions. Squaring both sides of the expansion
and integrating the result from $a$ to $b$ we arrive at the integral

$$\int_a^b f^2(x)\,dx$$

on the left-hand side. We shall suppose that the integral has a finite
numerical value. After the right-hand side is squared we obtain
the sum of the squares of the terms and the products of different
terms taken pairwise. The integrals of the latter are equal to zero
according to the orthogonality of functions (120) whereas the
integrals of the former are different from zero, and thus we have

$$\int_a^b f^2(x)\,dx = \sum_{n=1}^{\infty} a_n^2 \int_a^b g_n^2(x)\,dx \tag{121}$$

where $a_n$ are the coefficients of the expansion.
This formula is referred to as the Parseval relation (Parseval
theorem). In the particular case of a Fourier series of form (104)
we obtain

$$\int_{-\pi}^{\pi} f^2(x)\,dx = 2\pi a_0^2 + \pi \sum_{n=1}^{\infty} (a_n^2 + b_n^2)$$

and for series (105) we get

$$\int_{-l}^{l} f^2(x)\,dx = 2l a_0^2 + l \sum_{n=1}^{\infty} (a_n^2 + b_n^2)$$

The relations were found in 1805. By the way, they directly imply
that $a_n \to 0$ and $b_n \to 0$ as $n \to \infty$.
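
As a quick numerical illustration of the Parseval relation for a series of form (104), consider f(x) = x on (−π, π). This is a sketch under assumptions: the coefficients a_0 = 0, a_n = 0, b_n = 2(−1)^(n+1)/n are the standard ones for this function and are quoted here rather than derived in this section, and the name `parseval_sides` is hypothetical.

```python
import math

def parseval_sides(terms):
    """Both sides of the Parseval relation for f(x) = x on (-pi, pi).

    Left side: the integral of x^2 over (-pi, pi), equal to 2*pi^3/3.
    Right side: pi * sum of b_n^2 with b_n = 2*(-1)**(n+1)/n (a_0 = a_n = 0),
    truncated after the given number of terms.
    """
    left = 2.0 * math.pi ** 3 / 3.0
    right = math.pi * sum((2.0 / n) ** 2 for n in range(1, terms + 1))
    return left, right

if __name__ == "__main__":
    for terms in (10, 1000, 100000):
        print(terms, parseval_sides(terms))
```

Every partial sum on the right stays below the left-hand side and approaches it as more terms are taken.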

If a system of type (120) is incomplete it is possible to prove that 
it can be completed. After the completion, relation (121) becomes 
true. But all the terms on the right-hand side are non-negative and 
hence if a system of orthogonal functions is not complete we have 
the inequality 


$$\sum_{n=1}^{\infty} a_n^2 \int_a^b g_n^2(x)\,dx \le \int_a^b f^2(x)\,dx$$

for any function $f(x)$. The sign of equality occurs here only for
those functions $f(x)$ which can be expanded with respect to system
of functions (120).*

* The last relation is referred to as Bessel's inequality. - Tr.

Parseval's equality (121) enables us to apply a new approach to
constructing a series in orthogonal functions. Let us be given a finite
number of functions

$$g_1(x),\ g_2(x),\ \dots,\ g_n(x) \tag{122}$$

which are orthogonal on an interval $a \le x \le b$. We now pose the
following problem: it is required to form a linear combination
of functions (122) whose mean square deviation from a given
function $f(x)$ (see Sec. 7) is minimal.

To solve the problem let us consider functions (122) to be a part
of a complete orthogonal system of form (120). Then

$$f(x) - \sum_{k=1}^{n} C_k g_k(x) = \sum_{k=1}^{\infty} a_k g_k(x) - \sum_{k=1}^{n} C_k g_k(x) = \sum_{k=1}^{n} (a_k - C_k)\, g_k(x) + \sum_{k=n+1}^{\infty} a_k g_k(x)$$

where $a_k$ are the Fourier coefficients of the expansion of $f(x)$ with
respect to functions (120) and $C_k$ are arbitrary constants. By equality
(121), we can write

$$\int_a^b \left[f(x) - \sum_{k=1}^{n} C_k g_k(x)\right]^2 dx = \sum_{k=1}^{n} (a_k - C_k)^2 \int_a^b g_k^2(x)\,dx + \sum_{k=n+1}^{\infty} a_k^2 \int_a^b g_k^2(x)\,dx$$


The last formula indicates that if the coefficients $C_1, C_2, \dots, C_n$
are varied in an arbitrary way the minimal value of the right-hand
side is attained when

$$C_1 = a_1,\quad C_2 = a_2,\quad \dots,\quad C_n = a_n$$

Thus, to obtain a linear combination of a fixed number of
orthogonal functions (122) whose mean square deviation from $f(x)$ is
minimal we must take the corresponding partial sum of the
expansion of $f(x)$ into a series with respect to the system of orthogonal
functions (120), that is the linear combination with the coefficients
defined by formulas (100).
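
The minimal property of the partial sums can be illustrated numerically. The sketch below makes several assumptions for the sake of the example: the orthogonal functions are sin(kπx), k = 1, 2, 3, on 0 ≤ x ≤ 1, the function is f(x) = x, the integrals are replaced by the midpoint rule, and all helper names are hypothetical. The coefficients are computed as the ratio of integrals (the integral of f g_k divided by the integral of g_k²), which is what the text refers to as formulas (100).

```python
import math

def integrate(func, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def fourier_coeffs(f, basis, a, b):
    """a_k = (integral of f*g_k) / (integral of g_k^2) for an orthogonal system."""
    return [integrate(lambda x: f(x) * g(x), a, b) / integrate(lambda x: g(x) ** 2, a, b)
            for g in basis]

def squared_deviation(f, basis, coeffs, a, b):
    """Integral over [a, b] of [f(x) - sum of C_k g_k(x)]^2."""
    def error(x):
        return f(x) - sum(c * g(x) for c, g in zip(coeffs, basis))
    return integrate(lambda x: error(x) ** 2, a, b)

basis = [lambda x, k=k: math.sin(k * math.pi * x) for k in (1, 2, 3)]
f = lambda x: x                                   # illustrative choice
best = fourier_coeffs(f, basis, 0.0, 1.0)
shifted = [c + 0.05 for c in best]                # any other choice of C_k

dev_best = squared_deviation(f, basis, best, 0.0, 1.0)
dev_shifted = squared_deviation(f, basis, shifted, 0.0, 1.0)
print(dev_best, dev_shifted)
```

In agreement with the formula above, the deviation grows by the sum of (a_k − C_k)² times the integral of g_k², that is by 3 · 0.05² · 0.5 in this example.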

An analogous argument applied to a complete orthogonal system
of complex functions of type (118) and to series (119) leads to the
formula

$$\int_a^b |f(x)|^2\,dx = \sum_{n=1}^{\infty} |a_n|^2 \int_a^b |g_n(x)|^2\,dx \tag{123}$$

if we take into account the equality $aa^* = |a|^2$.


28. Hilbert Space. We now return to real functions defined on
a finite interval $a \le x \le b$. It turns out that there exists a
far-reaching analogy between such functions and vectors which we
mentioned in Sec. 20. Since we can perform linear operations on
the functions according to ordinary algebraic rules the functions
form a linear space in the sense of the definition given in Sec. VII.17.
Moreover, if we introduce the notion of a scalar product of two
functions by means of the formula

$$(f, g) = \int_a^b f(x)\, g(x)\,dx \tag{124}$$

we can readily verify that all the axioms enumerated in Sec. VII.20
are fulfilled and thus we obtain a space which is not only linear but
Euclidean as well. We include in this space all the bounded functions
and also all the unbounded functions with integrable (summable)
squares, that is such functions $f(x)$ that

$$(f, f) = \int_a^b f^2(x)\,dx < \infty \tag{125}$$

The simple inequality $2\,|f(x)\, g(x)| \le f^2(x) + g^2(x)$ implies that
integral (124) of functions satisfying condition (125) is convergent
(provided it is an improper integral).

The set of functions satisfying condition (125) equipped with
scalar product (124) is called the Hilbert space $L_2$ after the famous
German mathematician D. Hilbert (1862-1943). This is an
infinite-dimensional Euclidean space (see Sec. VII.20). By definition (124),
condition (91) is nothing but the orthogonality condition for vectors
belonging to this space. A complete orthogonal system of functions
is an orthogonal basis in the space $L_2$. It should be noted that when
applying the notion of a complete system of functions (see Sec. 21)
to the space $L_2$ we must interpret the convergence of series (99)
as the convergence in the mean square (see Sec. 8). Thus, the
convergence in $L_2$ is the convergence in the mean square. Formulas (100)
of the coefficients of an expansion are a particular case of formulas
(VII.29) and the orthogonalization process described in Sec. 20
is nothing but a realization of the process discussed in Sec. VII.21.
Further, Parseval's equality (121) in the Hilbert space $L_2$ is
analogous to Pythagoras' theorem: the square of the diagonal of a
rectangular parallelepiped is equal to the sum of squares of all its
dimensions. (We suggest that the reader try to interpret
geometrically the property of partial sums of series (99) which, as it was
proved in Sec. 27, minimize the mean square deviation.)

A characteristic feature of a Hilbert space is that it is infinite- 
dimensional. This property makes it difficult to test the completeness 
of an orthogonal system of functions. A system of k pairwise ortho- 
gonal nonzero vectors belonging to an n-dimensional Euclidean space 
is complete if k =n and incomplete if k< n. In contrast to it 
an infinite orthogonal system containing infinitely many functions 
belonging to an infinite-dimensional space may not be complete. 
Thus, the number of functions does not enable us to answer the
question whether a given orthogonal system is complete in the
case of an infinite-dimensional space. This is a rather difficult
problem and we shall not consider it here.

It can be proved that any incomplete orthogonal system is a part 
of a complete orthogonal system. Therefore a function can be expan- 
ded with respect to the former if and only if its expansion with 
respect to the latter has zero coefficients in those functions of the 
system which are not contained in the former. 

In Sec. VII.20 we proved that for Euclidean spaces there is an
important inequality of form (VII.26). Applying it to a function
space and squaring we deduce the inequality

$$\left(\int_a^b f(x)\, g(x)\,dx\right)^2 \le \left(\int_a^b f^2(x)\,dx\right)\left(\int_a^b g^2(x)\,dx\right)$$

Substituting $|f|$ for $f$ and 1 for $g$ into the inequality we get

$$\left(\int_a^b |f(x)|\,dx\right)^2 \le (b - a)\int_a^b f^2(x)\,dx$$


This implies that all the functions belonging to L, are summable 
and that convergence in the mean square implies convergence 
in the mean*. The converse may not be true. By the way, it 
is possible to prove that the Fourier series of any summable func- 
tion converges to the function in the mean. 

29. Orthogonality with Weight Function. When we integrate
a function $f(x)$ over an interval $a \le x \le b$ all the values of $x$
belonging to the interval are equivalent. But if we want to stress
the importance of a certain value of $x$ in comparison with the others
we introduce a weight function (weighting function) $\rho(x) > 0$ when
performing the integration:

$$\int_a^b f(x)\,\rho(x)\,dx$$

The function $\rho(x)$ is so chosen that its values should be greater for
the values of $x$ which are considered to be more important.

Two functions $g(x)$ and $h(x)$ are said to be orthogonal with weight
function $\rho(x)$ on an interval $(a, b)$ if

$$\int_a^b g(x)\, h(x)\,\rho(x)\,dx = 0$$

The whole theory of series in orthogonal functions presented in
Secs. 20, 21, 26 and 27 is directly extended to functions orthogonal
with weight function; to do this we must simply introduce the
factor $\rho(x)$ under the sign of integration in all the formulas.

* Convergence in the mean is understood here as convergence in the
mean of order one; see footnote on page 663. - Tr.

An integral involving a weighting function $\rho(x)$ can be easily
transformed to an integral without a weighting function (i.e. with
the weighting function $\rho \equiv 1$) by means of a change of the independent
variable:

$$\rho(x)\,dx = dt,\quad t = \int^{x} \rho(x)\,dx = t(x),\quad t(a) = \alpha \ \text{ and } \ t(b) = \beta \tag{126}$$

Indeed, if we introduce the notation $f(x) = f(x(t)) = \bar f(t)$ we
obtain

$$\int_a^b f(x)\,\rho(x)\,dx = \int_\alpha^\beta \bar f(t)\,dt$$

Under such a transformation any system of functions orthogonal
with weight function $\rho$ turns into a system of functions orthogonal
in the sense of our former definition (Sec. 20). But nevertheless it
is sometimes convenient to consider functions orthogonal with
weight function without transforming them according to formula
(126).

Conversely, change of variable (126) enables us to pass from any
orthogonal (in the ordinary sense) system of functions to a system
orthogonal with weight function $\rho$. Such a transformation yields the
corresponding transformation of the expansions in series and
therefore a complete system is transformed to a complete one. For instance,
taking the complete orthogonal system of functions

$$1,\ \cos t,\ \cos 2t,\ \dots,\ \cos nt,\ \dots \quad (0 \le t \le \pi)$$

[see system (94)] and performing the transformation

$$t = \pi - \arccos x,\quad dt = \frac{dx}{\sqrt{1 - x^2}}$$

we obtain the functions

$$T_0(x) = 1,\quad T_1(x) = \cos \arccos x = x,$$

$$T_2(x) = \cos (2 \arccos x) = 2x^2 - 1,$$

$$T_3(x) = \cos (3 \arccos x) = 4x^3 - 3x,\ \dots,$$

$$T_n(x) = \cos (n \arccos x),\ \dots$$

after an inessential change of the signs has been made (check it up!).
These polynomials were introduced by Chebyshev in 1857. They
are referred to as Chebyshev's polynomials. We see that they form
a complete system of functions orthogonal with weight function
$\frac{1}{\sqrt{1 - x^2}}$ on the interval $-1 \le x \le 1$. The polynomials can also
be obtained from system of functions (97) by means of the
orthogonalization process with this weight function (let the reader
perform the transformation!).
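
The orthogonality of Chebyshev's polynomials with the weight 1/√(1 − x²) can be checked numerically. This is only a sketch; two ingredients are not taken from the text and are assumed here as standard facts: the three-term recurrence T_{k+1}(x) = 2x T_k(x) − T_{k−1}(x) and the Gauss-Chebyshev quadrature rule, which integrates polynomial integrands against this weight. The function names are hypothetical.

```python
import math

def chebyshev(n, x):
    """T_n(x) computed by the recurrence T_{k+1} = 2x T_k - T_{k-1}."""
    t_prev, t_cur = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur

def weighted_integral(g, nodes=64):
    """Gauss-Chebyshev rule for the integral of g(x)/sqrt(1 - x^2) over [-1, 1]."""
    return (math.pi / nodes) * sum(
        g(math.cos((2 * i - 1) * math.pi / (2 * nodes))) for i in range(1, nodes + 1))

def inner(m, n):
    """Weighted scalar product of T_m and T_n."""
    return weighted_integral(lambda x: chebyshev(m, x) * chebyshev(n, x))

if __name__ == "__main__":
    print(inner(2, 3), inner(1, 4))   # different indices: close to zero
    print(inner(3, 3), math.pi / 2)   # equal nonzero indices: pi/2
    print(inner(0, 0), math.pi)       # index zero: pi
```

The recurrence values agree with the defining formula T_n(x) = cos(n arccos x), for instance T_3(0.5) = −1.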
To eliminate a weight function we can also apply the following
method: if we are given a system of functions

$$g_1(x),\ g_2(x),\ \dots,\ g_n(x),\ \dots \tag{127}$$

orthogonal with weight function $\rho(x)$ on an interval $a \le x \le b$
the system of functions

$$g_1(x)\sqrt{\rho(x)},\ g_2(x)\sqrt{\rho(x)},\ \dots,\ g_n(x)\sqrt{\rho(x)},\ \dots \tag{128}$$

is orthogonal without weight function on the same interval (check
it up!). To expand a function $f(x)$ in a series with respect to system
(127) it is sufficient to expand the function $f(x)\sqrt{\rho(x)}$ with respect
to system (128) and then cancel out the factor $\sqrt{\rho(x)}$.

30. Multiple Fourier Series. When expanding a function of several 
independent variables in a series we usually take a system of func- 
tions dependent on several indices whose number equals the number 
of the arguments. Then we arrive at multiple series as in Sec. 17. 

The theory of multiple series with respect to systems of orthogonal
functions is developed by analogy with the theory of ordinary series.
For the sake of simplicity, we restrict ourselves to the case of
functions of two arguments. In this case functions forming a complete
orthogonal system in a domain $D$ must depend on two indices,
that is have the form $\varphi_{mn}(x, y)$ where $m$ and $n$ assume some
discrete values. A series with respect to such a system is of the form

$$f(x, y) = \sum_{m,n} a_{mn}\,\varphi_{mn}(x, y)$$

where the symbol $\sum_{m,n}$ denotes the corresponding two-fold sum. The
coefficients are found after a manner of Sec. 24:

$$a_{mn} = \frac{\iint_D f(x, y)\,\varphi_{mn}(x, y)\,dx\,dy}{\iint_D \varphi_{mn}^2(x, y)\,dx\,dy} \tag{129}$$


There is a method of constructing a system of orthogonal functions
of several arguments on the basis of some given systems of functions
of one argument. Let us be given two complete orthogonal systems
of functions of one independent variable:

$$g_1(x),\ g_2(x),\ \dots,\ g_n(x),\ \dots \quad (a \le x \le b)$$

and

$$h_1(y),\ h_2(y),\ \dots,\ h_n(y),\ \dots \quad (c \le y \le d)$$

Then the system of functions

$$\varphi_{mn}(x, y) = g_m(x)\, h_n(y) \quad (m, n = 1, 2, \dots) \tag{130}$$

is orthogonal and complete in the rectangle $\Pi$: $a \le x \le b$,
$c \le y \le d$.

The orthogonality of the system is implied by the relation

$$\iint_\Pi \varphi_{mn}(x, y)\,\varphi_{\bar m \bar n}(x, y)\,dx\,dy = \int_a^b dx \int_c^d g_m(x)\, h_n(y)\, g_{\bar m}(x)\, h_{\bar n}(y)\,dy = \left(\int_a^b g_m(x)\, g_{\bar m}(x)\,dx\right)\left(\int_c^d h_n(y)\, h_{\bar n}(y)\,dy\right)$$

which always yields zero except the case when we simultaneously
have $m = \bar m$ and $n = \bar n$. The completeness can also be easily proved.
Namely, given an arbitrary function $f(x, y)$ defined in $\Pi$, we can
expand it with respect to functions $h_n(y)$ for any fixed $x$:

$$f(x, y) = \sum_{n=1}^{\infty} A_n(x)\, h_n(y)$$

where the coefficients of the expansion are dependent on $x$. Now we
can expand these coefficients $A_n(x)$ with respect to the functions
$g_m(x)$ which results in a double series with respect to system of
functions (130):

$$f(x, y) = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn}\, g_m(x)\, h_n(y)$$

Thus we have obtained what we set out to prove. Hence the
expansion is possible.

Taking two systems of functions of form (95) defined on the
intervals $0 \le x \le l_1$ and $0 \le y \le l_2$ we can form a complete
orthogonal system of functions on the corresponding rectangle:

$$\varphi_{mn}(x, y) = \sin\frac{m\pi x}{l_1}\,\sin\frac{n\pi y}{l_2} \quad (m, n = 1, 2, \dots)$$

An expansion with respect to this system has the form

$$f(x, y) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} \sin\frac{m\pi x}{l_1}\,\sin\frac{n\pi y}{l_2} \quad (0 \le x \le l_1,\ 0 \le y \le l_2)$$

The coefficients in the series are found on the basis of formula (129):

$$a_{mn} = \frac{4}{l_1 l_2} \int_0^{l_1} dx \int_0^{l_2} f(x, y)\,\sin\frac{m\pi x}{l_1}\,\sin\frac{n\pi y}{l_2}\,dy$$

(verify the result!).
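
The coefficient formula for the double sine series can be tried out numerically. The following sketch rests on assumptions made only for this illustration: the double integral is replaced by a midpoint rule on a uniform grid, the rectangle has sides l_1 = 1 and l_2 = 2, and the function name `double_sine_coeff` is hypothetical.

```python
import math

def double_sine_coeff(f, m, n, l1, l2, steps=200):
    """Approximate a_mn = 4/(l1*l2) * double integral of
    f(x, y) * sin(m pi x / l1) * sin(n pi y / l2) over the rectangle."""
    hx, hy = l1 / steps, l2 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * hx
        sx = math.sin(m * math.pi * x / l1)
        for j in range(steps):
            y = (j + 0.5) * hy
            total += f(x, y) * sx * math.sin(n * math.pi * y / l2)
    return 4.0 / (l1 * l2) * total * hx * hy

if __name__ == "__main__":
    l1, l2 = 1.0, 2.0
    # f(x, y) = x*y: the coefficient factorizes, a_11 = 4*l1*l2/pi^2
    print(double_sine_coeff(lambda x, y: x * y, 1, 1, l1, l2), 4 * l1 * l2 / math.pi ** 2)
    # f equal to a single basis function: only the matching coefficient is nonzero
    single = lambda x, y: math.sin(math.pi * x / l1) * math.sin(2 * math.pi * y / l2)
    print(double_sine_coeff(single, 1, 2, l1, l2), double_sine_coeff(single, 2, 1, l1, l2))
```

For f(x, y) = xy the double integral splits into a product of two one-dimensional integrals, which is why the expected value 4 l_1 l_2 / π² can be written down directly.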
31. Application to the Equation of Oscillations of a String. Fourier
series have many applications in the theory of equations of
mathematical physics. Here we shall give an example of such an
application to the problem of solving the equation of small free
transverse oscillations of a taut string.*

* Equations of mathematical physics (and, in particular, the application
of Fourier series to solving the equation of oscillation of a string) are treated in
greater detail in the Appendix at the end of this English edition. - Tr.

We shall consider the string to be weightless. Suppose the string
has a finite length $l$. Let us draw the x-axis along the string in its
equilibrium state so that the ends of the string have the coordinates
$x = 0$ and $x = l$, respectively. We shall study plane oscillations
and denote by $u(x, t)$ the transverse deflection of the point of the
string with abscissa $x$ at the moment $t$ from the equilibrium state.
In the theory of equations of mathematical physics it is proved that
the function $u(x, t)$ satisfies the following partial differential
equation of the second order:

$$\frac{\partial^2 u}{\partial t^2} = a^2\,\frac{\partial^2 u}{\partial x^2} \tag{131}$$

Here $a$ is a constant ($a = \sqrt{T/\rho}$ where $T$ is the tension of the string
and $\rho = \text{const}$ is its linear density). We shall consider the ends of
the string to be fixed. Then the corresponding boundary conditions
expressing this fact can be written in the form

$$u\big|_{x=0} = 0,\quad u\big|_{x=l} = 0 \quad (\text{for all } t) \tag{132}$$

We shall also suppose that the deflections and velocities of the
points of the string at the initial moment of time $t = 0$ are known.
Then we can write the initial conditions of the form

$$u\big|_{t=0} = \varphi(x),\quad \frac{\partial u}{\partial t}\Big|_{t=0} = \psi(x) \tag{133}$$

where $\varphi(x)$ and $\psi(x)$ are some given functions. Hence, we arrive
at the following mathematical problem: it is required to solve
equation (131) for boundary conditions (132) and initial conditions
(133).

To solve the problem let us look for the solution in
the form of a series of type (107) for each fixed $t \ge 0$. Then the
coefficients will be dependent on $t$ and thus we get

$$u(x, t) = \sum_{n=1}^{\infty} b_n(t)\,\sin\frac{n\pi x}{l} \tag{134}$$

To find the coefficients $b_n(t)$ we substitute this expression into
equation (131). This results in

$$\sum_{n=1}^{\infty} b_n''(t)\,\sin\frac{n\pi x}{l} = -a^2 \sum_{n=1}^{\infty} b_n(t)\,\frac{n^2\pi^2}{l^2}\,\sin\frac{n\pi x}{l}$$

that is

$$b_n''(t) = -\frac{a^2 n^2 \pi^2}{l^2}\, b_n(t)$$

Solving this ordinary linear differential equation with constant
coefficients by applying the methods given in Sec. XV.17 we
derive

$$b_n(t) = A_n \cos\frac{an\pi}{l}t + B_n \sin\frac{an\pi}{l}t$$

By (134), it follows that

$$u(x, t) = \sum_{n=1}^{\infty} \left(A_n \cos\frac{an\pi}{l}t + B_n \sin\frac{an\pi}{l}t\right)\sin\frac{n\pi}{l}x \tag{135}$$

To determine the coefficients $A_n$ and $B_n$ ($n = 1, 2, 3, \dots$) we take
advantage of initial conditions (133). This yields

$$\varphi(x) = \sum_{n=1}^{\infty} A_n \sin\frac{n\pi}{l}x,\quad \psi(x) = \sum_{n=1}^{\infty} B_n\,\frac{an\pi}{l}\,\sin\frac{n\pi}{l}x \quad (0 \le x \le l)$$

(check up the result!). We have arrived at Fourier expansions
of form (107) from which we find

$$A_n = \frac{2}{l}\int_0^l \varphi(x)\,\sin\frac{n\pi x}{l}\,dx \quad\text{and}\quad B_n = \frac{2}{an\pi}\int_0^l \psi(x)\,\sin\frac{n\pi x}{l}\,dx$$

Substituting these quantities into (135) we thus obtain the sought-for
solution. The boundary conditions are automatically satisfied here
because of the properties of the functions $\sin\frac{n\pi x}{l}$
which satisfy conditions (132).
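
The solution procedure of this section can be sketched in code: compute the sine coefficients of the initial data, then sum a finite number of terms of series (135). Everything specific in the example is an assumption made for illustration: the string of length l = 1 with a = 1, the triangular initial deflection ("plucked" string), zero initial velocity, the truncation after 50 terms and the function names.

```python
import math

def sine_coeffs(func, l, count, steps=2000):
    """(2/l) * integral_0^l func(x) sin(n pi x / l) dx for n = 1..count (midpoint rule)."""
    h = l / steps
    xs = [(i + 0.5) * h for i in range(steps)]
    values = [func(x) for x in xs]
    return [(2.0 / l) * h * sum(v * math.sin(n * math.pi * x / l) for v, x in zip(values, xs))
            for n in range(1, count + 1)]

def string_deflection(x, t, phi, psi, l, a, terms=50):
    """Partial sum of series (135) with A_n, B_n computed from the initial data."""
    A = sine_coeffs(phi, l, terms)
    # B_n = 2/(a n pi) * integral = l/(a n pi) * [(2/l) * integral]
    B = [l * c / (a * n * math.pi) for n, c in enumerate(sine_coeffs(psi, l, terms), start=1)]
    total = 0.0
    for n in range(1, terms + 1):
        w = a * n * math.pi / l
        total += (A[n - 1] * math.cos(w * t) + B[n - 1] * math.sin(w * t)) \
                 * math.sin(n * math.pi * x / l)
    return total

pluck = lambda x: x if x < 0.5 else 1.0 - x   # hypothetical initial deflection, l = 1
rest = lambda x: 0.0                          # zero initial velocity

if __name__ == "__main__":
    for t in (0.0, 0.1, 1.0, 2.0):
        print(t, string_deflection(0.25, t, pluck, rest, l=1.0, a=1.0))
```

At t = 0 the partial sum reproduces the initial deflection up to the truncation error, the ends x = 0 and x = l stay at rest, and the motion repeats after the time 2l/a because every term of (135) has this period.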


§ 5. Fourier Transformation 


32. Fourier Transform. Let us take formula (115) which
represents any finite function $f(x)$ defined over an interval $-l < x < l$
in the form of a complex Fourier series. This representation
involves complex harmonics, i.e. the functions $e^{ikx}$ with wave
numbers $k = k_n$ where

$$k_n = \frac{n\pi}{l} \quad (n = \dots, -2, -1, 0, 1, 2, \dots) \tag{136}$$

The set of these numbers (it is depicted in Fig. 342) is called the
spectrum of wave numbers. It is discrete, that is it consists of separate
points to each of which there corresponds a harmonic $e^{ik_n x}$ in
expansion (115) with complex amplitude $c_n$. The quantity $c_n$
($n = 0, \pm 1, \pm 2, \dots$) is defined by the formula

$$c_n = \frac{1}{2l}\int_{-l}^{l} f(x)\, e^{-ik_n x}\,dx \tag{137}$$

which is implied by formula (116).

Fig. 342

For the sake of simplicity, we first suppose that the function
$f(x)$ identically vanishes outside a finite interval $a \le x \le b$.
Let us put

$$\hat f(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)\, e^{-ikx}\,dx \tag{138}$$

This integral is in fact taken only over the interval $a \le x \le b$.
If $l$ is sufficiently large we can rewrite formula (137) of a Fourier
coefficient in the form

$$c_n = \frac{1}{2l}\int_a^b f(x)\, e^{-ik_n x}\,dx = \frac{\pi}{l}\cdot\frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)\, e^{-ik_n x}\,dx = \hat f(k_n)\,\Delta k \tag{139}$$

where $\Delta k = \frac{\pi}{l}$ is the distance between the neighbouring points
representing the wave numbers in the spectrum [see formula (136)
and Fig. 342]. Formula (115) representing $f(x)$ can therefore be
rewritten as

$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{ik_n x} = \sum_{n=-\infty}^{\infty} \hat f(k_n)\, e^{ik_n x}\,\Delta k \quad (-l < x < l) \tag{140}$$

Now suppose that $l$ is very large. Then the spectrum becomes very
"dense" and $\Delta k$ very small. In the limit, as $l \to \infty$, sum (140) which
is an integral sum turns into the corresponding integral, i.e. we
obtain

$$f(x) = \int_{-\infty}^{\infty} \hat f(k)\, e^{ikx}\,dk \quad (-\infty < x < \infty) \tag{141}$$

In this representation the wave number $k$ runs over all the values
ranging from $-\infty$ to $\infty$, i.e. the discrete spectrum of wave numbers
turns into a continuous spectrum in the limiting process as $l \to \infty$.
The first equality (139) shows that the amplitudes $c_n \to 0$ as
$l \to \infty$. This means that in the limit each harmonic has a zero
amplitude. But at the same time if we take an infinitesimal (but different
from zero) interval of wave numbers between $k$ and $k + dk$ we
obtain the amplitude

$$dc = \hat f(k)\,dk \tag{142}$$

corresponding to the interval. Hence, the amplitude turns out to
be distributed over the whole continuous spectrum of wave numbers,
in the limit. This resembles the transition from a discrete model
of a material body to its continuous model. Actually, in this
transition we assume that the mass of each separate point is equal to
zero and thus the total mass becomes continuously distributed over
all the points with a certain density. By analogy, we can say that
formula (142) describes the distribution of amplitudes of harmonics
with density $\hat f(k)$. Thus, $\hat f(k)$ is the density of the amplitude on an
infinitesimal interval of wave numbers. The density is related
to unit measure of length of the interval [$\hat f(k)$ is also referred
to as the spectral density of the function $f(x)$].

Formulas (138) and (141) express the so-called Fourier
transformation*. Formula (138) defines the direct transformation and
formula (141) the inverse transformation. We have deduced the
formulas under the assumption that the function $f(x)$ is identically
equal to zero outside a finite interval. Such functions are called
finite**. A more extensive investigation shows that the formulas
remain valid when integral (138) is understood as an improper
integral. For the integral to be convergent, it is sufficient to impose
the additional condition

$$\int_{-\infty}^{\infty} |f(x)|\,dx < \infty \tag{143}$$

Thus, to each function $f(x)$ satisfying condition (143) there
corresponds its Fourier transform $\hat f(k)$ [which is the result, image, arising
from the Fourier transformation applied to $f(x)$], the transformation
being defined by formula (138). Conversely, the function $f(x)$ is
expressed in terms of its Fourier transform by formula (141) and
is called the Fourier inverse transform [i.e. the inverse image,
pre-image, which is the result of the Fourier inverse transformation
performed on the function $\hat f(k)$].

* The expression on the right-hand side of (141) is also called the Fourier
integral. - Tr.

** The term a "finite function" should not be confused with the term a
"bounded function" whose range is contained in some finite interval (although we
sometimes say "finite" instead of "bounded"). To avoid the confusion we can
use the term "a function of finite support" when speaking about functions
identically vanishing outside an interval, the term taken from functional
analysis. - Tr.

If $f(x)$ is an even function we can use property 9, Sec. XIV.4,
when computing integral (138), which yields

$$\hat f(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x) \cos kx\,dx - \frac{i}{2\pi}\int_{-\infty}^{\infty} f(x) \sin kx\,dx = \frac{1}{\pi}\int_0^{\infty} f(x) \cos kx\,dx$$

It follows that $\hat f(k)$ is also an even function in this case and hence,
on the basis of formula (141), we obtain

$$f(x) = 2\int_0^{\infty} \hat f(k) \cos kx\,dk$$

These formulas define the so-called Fourier cosine transform and
its inverse image. Similarly, if $f(x)$ is an odd function we arrive
at the formulas defining the Fourier sine transform and its pre-image:

$$i\hat f(k) = \frac{1}{\pi}\int_0^{\infty} f(x) \sin kx\,dx,\quad f(x) = 2\int_0^{\infty} i\hat f(k) \sin kx\,dk$$

By the way, in the last case the term "Fourier sine transform of
a function $f(x)$" is usually applied to the function $i\hat f(k)$ instead
of $\hat f(k)$.

If a function $f(x)$ is originally defined on the positive semi-axis
$0 < x < \infty$ it can be extended onto the interval $-\infty < x < 0$
either as an even or as an odd function. Therefore considering both
$x$ and $k$ to be positive we can use the cosine transform as well as
the sine transform. But the images of these transformations will
be different in the general case.

Let us take an example. Suppose $f(x)$ is an even function equal
to 1 on the interval $-1 < x < 1$ and identically vanishing outside
it. Then by the formula of the Fourier cosine transform we get

$$\hat f(k) = \frac{1}{\pi}\left(\int_0^1 1 \cdot \cos kx\,dx + \int_1^{\infty} 0 \cdot \cos kx\,dx\right) = \frac{\sin k}{\pi k}$$

Applying the inverse transformation we obtain

$$f(x) = 2\int_0^{\infty} \frac{\sin k}{\pi k} \cos kx\,dk \tag{144}$$

As in the case of a Fourier series (see Sec. 25), if we substitute
numerical values of $x$ into the formula of the inverse Fourier transform
we get the corresponding values of $f(x)$ at all the points of continuity
of $f$ and the values $\frac{1}{2}[f(x - 0) + f(x + 0)]$ at all points where $f$
has a finite jump. In particular, substituting the value $x = 0$ into
(144) for which the function $f$ is continuous we get

$$1 = \frac{2}{\pi}\int_0^{\infty} \frac{\sin k}{k}\,dk, \quad\text{which implies}\quad \int_0^{\infty} \frac{\sin k}{k}\,dk = \frac{\pi}{2}$$

An integral formula equivalent to the formulas of the Fourier 
transforms was obtained by Fourier in 1811. 
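
The example of the rectangular pulse can be reproduced numerically. The sketch below is an illustration under assumptions: the cosine-transform integral and the integral of sin k / k are approximated by the midpoint rule, the infinite upper limit is replaced by the finite cutoff 200π, and the function names are hypothetical. Because the integral of sin k / k converges slowly, the truncated value agrees with π/2 only to about two or three decimal places.

```python
import math

def cosine_transform(f, k, upper, steps=20000):
    """(1/pi) * integral_0^upper f(x) cos(kx) dx, the transform of an even f
    that vanishes for x > upper (midpoint rule)."""
    h = upper / steps
    return (h / math.pi) * sum(f((i + 0.5) * h) * math.cos(k * (i + 0.5) * h)
                               for i in range(steps))

def pulse(x):
    """Even function equal to 1 for |x| < 1 and to 0 elsewhere."""
    return 1.0 if abs(x) < 1.0 else 0.0

def sine_integral(upper, steps=200000):
    """Integral of sin(k)/k from 0 to upper (midpoint rule, so k = 0 is never hit)."""
    h = upper / steps
    return h * sum(math.sin((i + 0.5) * h) / ((i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    k = 2.0
    print(cosine_transform(pulse, k, upper=1.0), math.sin(k) / (math.pi * k))
    print(sine_integral(200 * math.pi), math.pi / 2)
```

The first line compares the numerical transform with sin k / (πk); the second compares the truncated integral with π/2.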

33. Properties of Fourier Transforms. A Fourier transform possesses
many useful properties. We are going to enumerate some of them
here. First of all, it is clear that a Fourier transformation can be
interpreted as an operator (see Sec. XIV.26) for which the function
$f(x)$ is the inverse image (pre-image) and the function $\hat f(k)$ is the
image.

1. The Fourier operator is linear, that is

$$\widehat{f_1 + f_2} = \hat f_1 + \hat f_2,\quad \widehat{\alpha f} = \alpha \hat f \quad (\alpha = \text{const}) \tag{145}$$

This is directly implied by formula (138) and by the fact that the
integration is a linear operation.

Formula (145) implies, in particular, that if $f$ depends not only
on $x$ but also on a parameter $t$ the function $\hat f$ is dependent on the
parameter as well, and we have

$$\widehat{\left(\frac{f_{t+\Delta t} - f_t}{\Delta t}\right)} = \frac{\hat f_{t+\Delta t} - \hat f_t}{\Delta t}$$

Passing to the limit, as $\Delta t \to 0$, we obtain

$$\widehat{\frac{\partial f}{\partial t}} = \frac{\partial \hat f}{\partial t}$$

Consequently, the derivative with respect to a parameter of the
Fourier inverse transform is transformed into the derivative with
respect to the parameter of the Fourier transform. By the way,
the method of proving the property indicates that this property is
common to all the linear operators.

2. The differentiation of the function $f$ with respect to $x$ results
in the multiplication of its Fourier transform $\hat f$ by $ik$. Indeed, the
Fourier transform of the function $f'(x)$ is

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} f'(x)\, e^{-ikx}\,dx = \frac{1}{2\pi}\, e^{-ikx} f(x)\Big|_{x=-\infty}^{\infty} - \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)\,(-ik)\, e^{-ikx}\,dx$$

(we have performed integration by parts here). But condition (143)
indicates that $f(\pm\infty) = 0$ and therefore the first summand on the
right-hand side vanishes. The second summand is equal to

$$ik\,\frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)\, e^{-ikx}\,dx = ik \hat f(k)$$

which is what we set out to prove.

Transforming formula (141) in a similar way we can prove that
if the function $\hat f$ is differentiated with respect to $k$ its Fourier
inverse transform $f$ is multiplied by $-ix$.

3. If the function $\hat f(k)$ is the Fourier transform of the function
$f(x)$ the function $\frac{1}{a}\hat f\left(\frac{k}{a}\right)$ ($a = \text{const} > 0$) is the Fourier
transform of $f(ax)$. Actually, performing the change of variable $ax = s$
we obtain:

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} f(ax)\, e^{-ikx}\,dx = \frac{1}{a}\cdot\frac{1}{2\pi}\int_{-\infty}^{\infty} f(s)\, e^{-i\frac{k}{a}s}\,ds = \frac{1}{a}\hat f\left(\frac{k}{a}\right)$$

Therefore, if the graph of the Fourier inverse transform is stretched
a-fold along the x-axis, the graph of the Fourier transform is
contracted a-fold along the k-axis and vice versa. This means that we
cannot simultaneously localize (that is concentrate at a certain
point of the corresponding axis) both a function which serves as
a Fourier inverse transform and its spectral density. This is the
so-called uncertainty principle which has many applications in
physics.

4. If the function $f(x)$ is shifted by $\beta = \text{const}$ along the x-axis
(i.e. its graph is shifted by the distance $\beta$), its Fourier transform
is multiplied by $e^{-i\beta k}$. In fact, making the substitution $x - \beta = s$
we obtain

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} f(x - \beta)\, e^{-ikx}\,dx = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(s)\, e^{-iks} e^{-ik\beta}\,ds = e^{-ik\beta}\hat f(k)$$

Conversely, if the transform is shifted by $\beta$ along the k-axis the
inverse transform is multiplied by $e^{i\beta x}$.
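
Properties 3 and 4 can be checked on a concrete function. The sketch below uses assumptions that are not part of the text: the Gaussian f(x) = exp(−x²), whose transform under definition (138) is exp(−k²/4)/(2√π) (a standard result quoted here without derivation), a truncation of the integral to |x| ≤ 12, the midpoint rule, and the hypothetical name `fourier_transform`.

```python
import cmath
import math

def fourier_transform(f, k, half_width=12.0, steps=6000):
    """Approximate (1/(2 pi)) * integral of f(x) exp(-ikx) dx over |x| <= half_width."""
    h = 2.0 * half_width / steps
    total = 0j
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += f(x) * cmath.exp(-1j * k * x)
    return total * h / (2.0 * math.pi)

gauss = lambda x: math.exp(-x * x)
k, beta, a = 1.5, 0.7, 2.0

base = fourier_transform(gauss, k)
shifted = fourier_transform(lambda x: gauss(x - beta), k)   # transform of f(x - beta)
scaled = fourier_transform(lambda x: gauss(a * x), k)       # transform of f(ax)

print(base, math.exp(-k * k / 4) / (2 * math.sqrt(math.pi)))
print(shifted, cmath.exp(-1j * k * beta) * base)            # property 4
print(scaled, fourier_transform(gauss, k / a) / a)          # property 3
```

Each printed pair should agree to many decimal places, since the Gaussian is negligibly small at the ends of the truncated interval.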
5. Parseval's Theorem. If we apply formula (123) to series (115)
we obtain

$$\int_{-l}^{l} |f(x)|^2\,dx = 2l \sum_{n=-\infty}^{\infty} |c_n|^2$$

Taking advantage of formula (139) we deduce from the last relation
the equality

$$\int_{-l}^{l} |f(x)|^2\,dx = 2l \sum_{n=-\infty}^{\infty} |\hat f(k_n)|^2 (\Delta k)^2 = 2\pi \sum_{n=-\infty}^{\infty} |\hat f(k_n)|^2\,\Delta k$$

Passing to the limit as $l \to \infty$ we obtain, by analogy with Sec. 32,
the relation

$$\int_{-\infty}^{\infty} |f(x)|^2\,dx = 2\pi \int_{-\infty}^{\infty} |\hat f(k)|^2\,dk$$

It is this relation that is called Parseval's theorem for the Fourier
transform.

34. Application to Oscillations of Infinite String. The Fourier 
integral transformation is applied to solving some problems of 
mathematical physics for infinite media. We shall illustrate the 
solution of equation (131) in the case of an infinite string, i.e. when 
—oco <x < œ. Hence, there are no boundary conditions in this 
problem and it is only the initial conditions that define the sought- 
for solution. For the sake of simplicity, let us put p (z) = 0 in 
the initial conditions (133). We denote by u (k, t) the Fourier trans- 
form of the solution u (z, t) for any fixed value of ¢ > 0. Passing 
to the Fourier transforms of the left-hand and right-hand sides of 
equation (131) and taking advantage of properties 1 and 2 in Sec. 33 
we obtain 

Ge T S N ERY y 7 T 
mE (ik) u = —@ku 
For any fixed k, this is an ordinary linear differential equation with 
constant coefficients which can be solved by means of the standard 
methods given in Sec. XV.17. Thus we obtain 
a= C;, (k) et +-C, O ef! (146) 
Now we take the Fourier transforms of the initial conditions 
(133): A 
~ a ĝu 
u li=0 = @ (k) and ale =0 


Consequently, formula (146) results in

$$\hat u = \frac{1}{2}\,\hat\varphi(k)\,e^{iakt} + \frac{1}{2}\,\hat\varphi(k)\,e^{-iakt}$$

(verify the calculations!). On the basis of properties 1 and 4 in
Sec. 33, we now return to the inverse image:

$$u = \frac{1}{2}\,\varphi(x - at) + \frac{1}{2}\,\varphi(x + at)$$


It is the last formula that yields the sought-for solution. The 
meaning of the formula is quite simple: the initial deflection from 
the equilibrium state is divided into two equal parts; one of the 
parts is shifted at the moment $t$ by the distance $at$ in the positive
direction of the $x$-axis whereas the other is shifted by the same distance
in the opposite direction. In other words, the two waves propagate
along the string with the speed $a$ in the positive and negative
directions without changing their initial form. At each moment of 
time we watch the result of the superposition of the waves. Thus, 
we see what is the physical meaning of the constant a entering 
into equation (134): it is equal to the speed at which an initial 
perturbation propagates along the string. 
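
The splitting of the initial deflection into two travelling halves is easy to reproduce numerically. The following is a minimal sketch; the triangular initial deflection `bump`, the speed value and the function names are assumptions made for illustration and are not taken from the text.

```python
def string_deflection(phi, a, x, t):
    # u(x, t) = phi(x - a t) / 2 + phi(x + a t) / 2
    # (infinite string, zero initial velocity).
    return 0.5 * phi(x - a * t) + 0.5 * phi(x + a * t)

def bump(x):
    # Hypothetical initial deflection: a triangular hump of height 1 on [-1, 1].
    return max(0.0, 1.0 - abs(x))

a = 2.0  # assumed propagation speed

print(string_deflection(bump, a, 0.0, 0.0))   # full height at the initial moment
print(string_deflection(bump, a, 6.0, 3.0))   # crest of the half moving to the right
print(string_deflection(bump, a, -6.0, 3.0))  # crest of the half moving to the left
print(string_deflection(bump, a, 0.0, 3.0))   # the origin is at rest again
```

At the moment $t = 3$ each crest has height one half of the original and is located at the distance $at = 6$ from the origin, in agreement with the interpretation of $a$ as the speed of propagation.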


CHAPTER XVIII 


Elements of the Theory 
of Probability 


§ 1. Random Events and Their Probabilities 


1. Random Events. The theory of probability deals with random 
events. The notion of an event is a basic one, and it is rather difficult 
to give its comprehensive definition. 

For the aims of our course it will be sufficient to regard as an event 
everything that may or may not occur when a certain set of con- 
ditions is realized. Every realization of this kind is called a trial. 
For instance, when tossing a coin we can consider the fact that it 
shows heads to be an event. In this case tossing the coin serves as 
a trial. We can regard it as an event when an article randomly se- 
lected from a lot containing a number of manufactured articles 
turns out to be defective. In this example sampling a unit from the 
lot is a trial. But a trial, as it is understood in the theory of pro- 
bability, must not necessarily be connected with human activities. 
For example, if we consider it to be an event that it will rain in 
a certain place on a certain day, the fact that this day comes should 
be regarded as a trial. 

A characteristic feature of a random event is that it may not 
necessarily occur when a trial is realized. This distinguishes a ran- 
dom event from a deterministic one which inevitably occurs. The 
randomness of an event is connected with the fact that many con- 
comitant factors which are essential for the outcome of a trial may 
not be given. The incompleteness of information can sometimes 
be intrinsic (for instance, in games of chance or in warfare) or can 
result from the inaccessibility of some kind of information at the 
present level of the development of science (for example, in problems 
of weather forecast). The assumption that the outcomes of individual 
trials cannot be predicted is taken as a basic principle in quantum 
mechanics, genetics and some other sciences. Besides, there are 
some cases in which the exact prediction of the outcomes of certain 
trials is possible but not advantageous when it requires unnecessary 
expenditure connected with additional precision measurements and 
the like. 


Regularities of random events appear in mass-scale phenomena
when trials are repeated a large number of times. For instance, we 
cannot predict the result of a single toss of a coin because it can 
come up heads or tails. Nobody will find it very strange if we have 
heads twice when we toss a coin ten times. But if we get only 200 
heads after the coin has been tossed 1000 times we have every reason 
to say that there is something wrong with the coin or with tossing. 
Indeed, if the conditions are equal neither heads nor tails have
any advantage and therefore they must appear an approximately
equal number of times. Of course, when tossing an “honest” coin
1000 times we may not necessarily have heads exactly 500 times; 
we can have them 490 or 525 times or so but not 200 times! Similarly, 
if we examine a single unit selected at random from a lot we cannot 
have a good judgement on the quality of the lot. This can be done 
only after a sufficiently large number of repeated trials have been 
performed or, as we say, when we have a sufficiently large sample
size. Thus, specifying what was said at the beginning of this section, 
we can say that the theory of probability deals with random events 
which occur in mass-scale phenomena when the corresponding set 
of conditions is realized a large number of times. 

There are two possible ways of understanding the repetition 
of trials. For instance, we can toss one and the same coin 1000 times 
but we can also independently toss 1000 similar coins at different
instants of time or even simultaneously. Both possibilities are
essentially equivalent, and further we shall not distinguish between
them.

2. Probability. In everyday life we often say that a certain event 
is highly probable whereas some other event is improbable. Of 
course, in case the corresponding trials can be repeated many times 
these assertions mean that the former event will occur frequently 
and the latter will occur seldom. 

An important feature of the theory of probability is that it not 
only indicates that the probability of an event is high or low but 
also attributes an exact numerical value to it. Hence, the probability 
of an event is considered to be a numerical value which characterizes 
the frequency of the occurrences of the event in a large number of 
repeated trials. 

Suppose that a coin was tossed 1000 times and that it came up 
heads 490 times. Then the ratio $\frac{490}{1000} = 0.49$ is said to be the relative
frequency of the coin coming up heads in the given series of trials. 
Let the coin be tossed 10,000 times and let heads appear 5027 times; 
then the relative frequency is equal to 0.5027. It is clear that if 
the coin is symmetric and if the number of trials increases the rela- 
tive frequency of heads must approach 0.5 because neither of the 
faces of the coin has any advantage over the other. It is the number
0.5 that is called the probability of heads when the coin is being
tossed. 

In the general case the definition is stated similarly. Denote
a random event by $A$. Suppose that the event occurred $N_A$ times
in a series of $N$ independent trials. Then the ratio $\frac{N_A}{N}$ is called the
relative frequency of the event $A$ in the given series of trials.
The limit

$$P\{A\} = \lim_{N\to\infty} \frac{N_A}{N}$$

to which the relative frequency of the event $A$ tends when the number
of trials is increased unlimitedly is called the probability of the
random event $A$.

Hence, if the number of trials is sufficiently large the relative 
frequency of an event can be approximately taken as its probability. 
This fact implies a method of empirical calculation of probabilities 
when it is difficult to find them theoretically. Let us consider an 
example. The probability that a new-born child will be a boy is 
known with a great accuracy from the statistics of human popula- 
tion. The probability equals 0.512 although from time to time there 
appears a deviation from this value. Knowing this probability we 
cannot predict whether a new-born child will be a boy or a girl 
for each concrete case. We can only say that the probability of the 
child being a boy is a little higher, i.e. the birth of a boy is a little 
more probable. But nevertheless we can assert that the number of
boys among a million new-born children will be close to 512,000
(in Sec. 20 we shall work out some methods for estimating the degree 
of this closeness). 
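
The tendency of the relative frequency towards the probability can be observed in a simulated experiment. The sketch below is an illustration only: the pseudo-random generator plays the role of the coin, and the function name, the seed and the numbers of trials are assumptions of the example.

```python
import random

def relative_frequency(n_trials, p_heads=0.5, seed=0):
    # Simulate n_trials independent tosses and return N_A / N,
    # where N_A is the number of tosses that came up heads.
    rng = random.Random(seed)  # fixed seed makes the run reproducible
    heads = sum(1 for _ in range(n_trials) if rng.random() < p_heads)
    return heads / n_trials

for n in (10, 1000, 100000):
    print(n, relative_frequency(n))
```

For small numbers of trials the printed frequency may deviate noticeably from 0.5, whereas for a large number of trials it stays close to this value.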

There are some cases when the probability can be calculated by 
means of figuring the number of favourable trial outcomes (favourable 
cases). We shall illustrate this method by taking a concrete example. 
Suppose that we toss a die on whose faces sums of points from 1 to 6
are marked. Let it be necessary to find the probability of a throw
giving a sum of points divisible by 3. Imagine that we have made
a large number $N$ of throws. Let $N_k$ denote the number of occurrences
of $k$ points ($k$ = 1, 2, 3, 4, 5 and 6). Then we have
$N_1 + N_2 + \dots + N_6 = N$, that is
$\frac{N_1}{N} + \frac{N_2}{N} + \dots + \frac{N_6}{N} = 1$. But none
of the faces having any advantage over the others, all the six fractions
are approximately equal to each other if $N$ is sufficiently large, and
in the limit they become exactly equal to each other when $N \to \infty$.
Hence, in the limit the fractions are equal to $\frac{1}{6}$. But a sum of points
is a multiple of 3 if it equals three or six, and hence the number of
favourable outcomes is equal to $N_3 + N_6$. The relative frequency
of the event is equal to

$$\frac{N_3 + N_6}{N} = \frac{N_3}{N} + \frac{N_6}{N} \to \frac{1}{6} + \frac{1}{6} = \frac{1}{3} \quad (N \to \infty)$$

This is just the sought-for probability. The result thus obtained 
can be formulated briefly as follows: we have six possible outcomes 
in throwing a die which correspond to the range of possible sums 
of points. There are two outcomes among them which are favourable 
to the event in question, namely throwing three and six. The other 
cases are unfavourable. Hence, the probability of the event equals
$\frac{2}{6} = \frac{1}{3}$.

The general scheme of such calculations can be described in the 
following way. Suppose that a trial can result in exactly one of n 
possible outcomes, these outcomes being equally probable (in such 
circumstances we also say that we have n equally probable possible 
cases). Let us consider an event A which occurs when q of these 
outcomes appear and does not occur when the other n — q outcomes 
appear. We call these $q$ outcomes favourable to the event $A$ whereas
the other $n - q$ cases are called unfavourable to $A$. Then, reasoning
as in the preceding paragraph, we conclude that

$$P\{A\} = \frac{q}{n}$$


Thus, the probability of an event is equal to the ratio of the num- 
ber of trial outcomes favourable to the event to the number of all 
possible outcomes. 
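
The scheme of favourable cases amounts to a simple count. A minimal sketch, assuming that all the listed outcomes are equally probable (the function name is hypothetical):

```python
from fractions import Fraction

def classical_probability(outcomes, is_favourable):
    # Ratio q / n of the number q of favourable outcomes
    # to the number n of all equally probable outcomes.
    outcomes = list(outcomes)
    q = sum(1 for outcome in outcomes if is_favourable(outcome))
    return Fraction(q, len(outcomes))

# Throwing a die: the sum of points is divisible by 3.
p = classical_probability(range(1, 7), lambda points: points % 3 == 0)
print(p)  # 1/3
```

Exact fractions are used here instead of floating-point numbers so that the result can be compared directly with the value found above.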

Using the scheme of favourable cases we can easily find the pro- 
bability of winning a prize for an owner of a lottery ticket (for 
this purpose the total number of prizes should be divided by the 
number of tickets). Many other similar probabilities can be found 
in a similar way. In performing such calculations we must be sure 
that the outcomes of trials are equally possible, i.e. equally pro- 
bable. For instance, it would be wrong to reason in the following 
way: the sum of points obtained when tossing the die can be either 
divisible by three or indivisible and therefore there is one favourable 
case among the two for getting a sum multiple of three, and hence
the sought-for probability equals $\frac{1}{2}$ (where does the mistake lie
in this argument?). Of course, we always idealize reality when we
consider some outcomes to be equally possible, and therefore this 
assumption only approximately holds in concrete problems, with 
a certain accuracy. Such an approach is justified only when there 
is symmetry in trials under consideration. 

There are various modifications of the scheme of figuring the 
number of favourable cases. Let us consider an example illustrating
one of the variants of the method. Suppose we have a homogeneous
spherical ball with surface area $S$. Let a portion of the surface having
an area $S_0$ be blackened. Let the ball be thrown at random on a horizontal
plane and let it be necessary to calculate the probability
that the ball strikes the plane with the portion $S_0$ of its surface.
To perform the calculation we imagine that the whole surface is
divided into small parts of equal areas $dS$. Then the point of impact
can belong to any of these parts with an equal probability. But the
total number of these parts is equal to $\frac{S}{dS}$, and the number of the
parts belonging to the portion $S_0$ is equal to $\frac{S_0}{dS}$. Hence, there are
$\frac{S_0}{dS}$ cases favourable to the event in question among the total number
of cases $\frac{S}{dS}$, and thus the sought-for probability is equal to
$\frac{S_0}{dS} : \frac{S}{dS} = \frac{S_0}{S}$. If we pose the same problem for an ellipsoid instead of the


ball the solution will depend not only on the area but also on the 
disposition of the blackened portion on the surface of the ellipsoid. 
The solution involves integration, and we leave it to the reader. 

In conclusion we note that in everyday life the term “probability” 
is sometimes applied to such events which cannot be repeated, even 
mentally. For instance, we sometimes speak about the probability 
of whether there exists life on Mars and the like. In such cases it 
would be better to speak about estimating the likelihood of a hypo- 
thesis. The likelihood theory is not thoroughly developed at present. 

3. Basic Properties of Probabilities.

1. The probability of any event A is a dimensionless quantity 
whose numerical value lies between the limits 0 and 1:

$$0 \le P\{A\} \le 1$$


The property immediately follows from the definition of probabi- 
lity given in Sec. 2. The definition also indicates that the greater 
P {A}, the greater the possibility of the occurrence of the event, 
i.e. the greater its probability understood in everyday sense. 

2. The probability of a certain (sure) event, that is of an event 
which unavoidably occurs, is equal to unity. Thus, we regard a cer- 
tain, deterministic event as a special case of a random event (this 
resembles our arguments in Sec. I.5 where we considered a constant
quantity to be a special case of a variable quantity). Further, the 
probability of an impossible event is equal to zero. 

In the case of a finite number of possible outcomes of trials the 
converse assertions are also true. Namely, if the probability is equal 
to unity (zero) the event is certain (impossible). But in the general 
case these assertions are no longer true. For instance, our discussion
in Sec. 2 shows that the probability that a ball thrown at random
will strike a plane with a point which is set beforehand is equal 
to zero. At the same time such an event is not impossible (theore- 
tically) because its occurrence does not contradict the laws of mecha- 
nics. But of course the event is practically impossible. 

3. The sum of probabilities of any event and its opposite event 
is always equal to unity. We say that two events are contrary or 
opposite to each other if the occurrence of one of them is equivalent 
to the non-occurrence of the other. In other words, each of the two 
contrary events is the negation of the other. If the probability of 
hitting a target under certain conditions is equal to 0.2 the proba- 
bility of failing to hit the target under the same conditions is equal 

to 0.8. To prove this assertion in the general case we denote two
contrary events by $A$ and $\bar A$. Let $N$ trials be made and let the event $A$
occur $N_A$ times and the event $\bar A$ occur $N_{\bar A}$ times. Then it is evident
that $N_A + N_{\bar A} = N$ which implies $\frac{N_A}{N} + \frac{N_{\bar A}}{N} = 1$. Passing to the
limit for $N \to \infty$ we find that

$$P\{A\} + P\{\bar A\} = 1 \qquad (1)$$


4. We can similarly prove a more general assertion: if a trial
results in the necessary occurrence of one and only one event belonging
to a group of events $A_1, A_2, \dots, A_n$ we have

$$P\{A_1\} + P\{A_2\} + \dots + P\{A_n\} = 1 \qquad (2)$$


5. We now consider two events A and B such that each of them 
may or may not occur when one and the same trial is performed. 
Suppose that $N$ such trials have been made. Let $N_{A\text{ and }B}$ designate
the number of trials in which both events occurred and let $N_{A\text{ and }\bar B}$
be the number of trials in which $A$ occurred and $B$ did not occur
and so on. Using this notation we can write

$$N = N_{A\text{ and }B} + N_{A\text{ and }\bar B} + N_{\bar A\text{ and }B} + N_{\bar A\text{ and }\bar B}$$

Besides, for the total number of trials in which the event $A$ occurred
and for the total number of trials in which the event $B$ occurred
we can write

$$N_A = N_{A\text{ and }B} + N_{A\text{ and }\bar B} \quad\text{and}\quad N_B = N_{A\text{ and }B} + N_{\bar A\text{ and }B} \qquad (3)$$

Furthermore, let us denote the number of trials in which at least
one of the events $A$ and $B$ occurred by $N_{A\text{ or }B}$. Then we have

$$N_{A\text{ or }B} = N_{A\text{ and }B} + N_{A\text{ and }\bar B} + N_{\bar A\text{ and }B} \qquad (4)$$

Formulas (3) and (4) imply

$$\frac{N_{A\text{ or }B}}{N} = \frac{N_A}{N} + \frac{N_B}{N} - \frac{N_{A\text{ and }B}}{N}$$

Passing to the limit, as $N \to \infty$, we arrive at the formula

$$P\{A \text{ or } B\} = P\{A\} + P\{B\} - P\{A \text{ and } B\}$$


in which the sense of the notation is quite clear. 

In particular, if the events A and B are mutually exclusive, that 
is such that they cannot occur simultaneously, we obtain the follow- 
ing theorem of addition of probabilities (addition rule of probability 
theory): 


P {A or B} = P {A} + P {B} (where A and B 
are mutually exclusive) 


The following more general rule is proved in a similar way: if
the events $A_1, A_2, \dots, A_k$ are pairwise mutually exclusive we have

$$P\{A_1 \text{ or } A_2 \dots \text{ or } A_k\} = P\{A_1\} + P\{A_2\} + \dots + P\{A_k\} \qquad (5)$$

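
The addition formulas can be verified by direct enumeration for a die. The events chosen below (an even sum of points, a sum divisible by three, and so on) are assumptions made for the sake of the example, as are the helper names.

```python
from fractions import Fraction

FACES = range(1, 7)  # six equally probable outcomes of throwing a die

def prob(event):
    # Probability of an event as the share of favourable faces.
    return Fraction(sum(1 for s in FACES if event(s)), len(FACES))

def is_even(s):
    return s % 2 == 0

def is_multiple_of_three(s):
    return s % 3 == 0

# General case: the two events can occur together (when six points appear).
p_or = prob(lambda s: is_even(s) or is_multiple_of_three(s))
p_by_formula = (prob(is_even) + prob(is_multiple_of_three)
                - prob(lambda s: is_even(s) and is_multiple_of_three(s)))
print(p_or, p_by_formula)  # 2/3 2/3

# Mutually exclusive events: one point or two points.
p_exclusive = prob(lambda s: s == 1 or s == 2)
print(p_exclusive, prob(lambda s: s == 1) + prob(lambda s: s == 2))  # 1/3 1/3
```

In the first case the term $P\{A \text{ and } B\} = \frac{1}{6}$ must be subtracted, otherwise the outcome "six points" would be counted twice.
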
4. Theorem of Multiplication of Probabilities. Let A and B be 
two events. Then the conditional probability $P\{A \mid B\}$ of the event
$A$ relative to the hypothesis that the event $B$ has occurred is the probability
of the event $A$ calculated on the condition that the event $B$
has taken place. Therefore, when calculating this probability by
means of the corresponding relative frequency (see Sec. 2), we must
take into account only those trials whose outcomes resulted in the
occurrence of the event $B$:

$$P\{A \mid B\} = \lim_{N\to\infty} \frac{N_{A\text{ and }B}}{N_B}$$

For instance, suppose we are given two urns. Let the first urn 
contain three black balls and one white ball and the second one 
contain one black ball and three white balls. Suppose that we ran- 
domly select one of the urns and draw a ball from it at random. 
What is the probability that the ball will be black? The obvious
symmetry of the possible outcomes indicates that $P\{A_{\text{black}}\} = \frac{1}{2}$
where $A_{\text{black}}$ is the event consisting in the occurrence of a black
ball. We now suppose that it is known that we have selected the
first urn. Let us denote this event (that is selecting the first urn)
as $B_1$. Then it is apparent that the conditional probability

$$P\{A_{\text{black}} \mid B_1\} = \frac{3}{4}$$

Taking the simple formula

$$\frac{N_{A\text{ and }B}}{N} = \frac{N_B}{N} \cdot \frac{N_{A\text{ and }B}}{N_B}$$

and passing to the limit, as $N \to \infty$, we obtain the following multiplication
rule of probability theory:


P {A and B} = P {B} P {A |B} = P {A} P {B | A} (6) 


(the last expression has been obtained by interchanging the roles 
of A and B). 

Thus, the probability that two events take place simultaneously 
is equal to the product of the probability of one of them by the con- 
ditional probability of the other provided that the first event has 
occurred. 
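
The multiplication rule can be checked on the example of the two urns by listing all the outcomes. The sketch relies on an assumption which holds here but not in general: both urns contain the same number of balls, so that all the pairs (urn, ball) are equally probable. The names used in the code are hypothetical.

```python
from fractions import Fraction

urns = {
    1: ["black", "black", "black", "white"],
    2: ["black", "white", "white", "white"],
}
# Eight equally probable outcomes: which urn was selected and which ball was drawn.
outcomes = [(urn, ball) for urn, balls in urns.items() for ball in balls]

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def conditional(event, given):
    # P{A | B}: only the outcomes in which B occurred are taken into account.
    return prob(lambda o: event(o) and given(o)) / prob(given)

def black(o):
    return o[1] == "black"

def first_urn(o):
    return o[0] == 1

print(prob(black))                    # 1/2
print(conditional(black, first_urn))  # 3/4
# Both sides of formula (6):
print(prob(lambda o: black(o) and first_urn(o)),
      prob(first_urn) * conditional(black, first_urn))  # 3/8 3/8
```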

Formula (6) becomes especially simple when the events A and B 
are independent. We call two events independent if any information 
concerning the occurrence or non-occurrence of one of them does 
not affect the probability of the other. Thus, in this case we have 


$$P\{A \mid B\} = P\{A\}, \quad P\{A \mid \bar B\} = P\{A\},$$
$$P\{B \mid A\} = P\{B\}, \quad P\{B \mid \bar A\} = P\{B\}$$


(By the way, on the basis of equalities (1), (5) and (6), we can easily 
conclude that each of the above relations implies the other three.) 
Formula (6), for independent events, turns into 


P {A and B} = P {A} P {B} (7)
(where A and B are independent) 


Formula (7) can be readily extended to the case of an arbitrary 
number of independent events, that is events such that the infor- 
mation concerning the occurrence or non-occurrence of any group 
of these events does not affect the probabilities of the others. For 
example, if A, B and C are independent events we have 


P {A and B and C} = P {A and (B and C)} = 
= P {A} P {B and C} = P {A} P {B} P {C} (8) 


The above rules enable us to calculate probabilities for some
simple problems. Let us take an example. Suppose there are three
marksmen and each of them fires at a target once. Let the first of them
hit the target with a probability of 0.2, the second with a probability
of 0.3 and the third with 0.5. What is the probability that
the target will be hit at least once? If we denote the event that
the $k$th marksman hits the target as $A_k$ ($k$ = 1, 2, 3) we can say that
we are interested in the probability $P\{A_1 \text{ or } A_2 \text{ or } A_3\}$. We cannot
apply formula (5) here because the events $A_k$ are not mutually




exclusive since the target can be simultaneously hit by two or three 
shots. Therefore in this case it is easier to calculate the probability 
that all the shots miss the target because these opposite events are 
independent. Thus, by formulas (1) and (8), we obtain 


$$P\{A_1 \text{ or } A_2 \text{ or } A_3\} = 1 - P\{\bar A_1 \text{ and } \bar A_2 \text{ and } \bar A_3\} = 1 - P\{\bar A_1\}\,P\{\bar A_2\}\,P\{\bar A_3\} = 1 - 0.8 \times 0.7 \times 0.5 = 0.72$$
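
The same computation is convenient to carry out for any number of independent shots. A minimal sketch (the function name is an assumption of this example):

```python
from math import prod

def prob_at_least_one(hit_probs):
    # Probability that at least one of several independent events occurs:
    # unity minus the product of the probabilities of the opposite events.
    return 1.0 - prod(1.0 - p for p in hit_probs)

print(prob_at_least_one([0.2, 0.3, 0.5]))  # approximately 0.72
```

Passing to the opposite events is essential here: simply adding 0.2, 0.3 and 0.5 would give unity, which is wrong because the three events are not mutually exclusive.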


Let us consider one more example. Suppose that we randomly 
draw two balls in succession from an urn containing three black 
balls and one white ball. What is the probability that both balls 
will be black? There can be two variants of the problem. Namely,
we can consider sampling with replacement, which means
that the first ball drawn from the urn is replaced in the urn after
its colour has been noted and before the next drawing is made.
Hence, the case when the same ball will be drawn a second time
is not excluded here. Formula (7) is obviously applicable here,
and thus we see that the sought-for probability is equal to
$\frac{3}{4} \cdot \frac{3}{4} = \frac{9}{16}$. But if we consider drawing without replacement, that is if
the first ball drawn from the urn is not replaced in the urn after
its colour has been examined, then this ball does not take part in
the second sampling, and therefore the sought-for probability is
calculated by formula (6) which yields $\frac{3}{4} \cdot \frac{2}{3} = \frac{1}{2}$.
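
Both variants of the problem can be verified by enumerating the ordered pairs of balls. In the sketch below the four balls are treated as distinguishable, so that all the pairs are equally probable; the helper name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

balls = ["black", "black", "black", "white"]

def prob_both_black(pairs):
    # Share of ordered pairs (first ball, second ball) in which both are black.
    pairs = list(pairs)
    favourable = sum(1 for first, second in pairs if first == second == "black")
    return Fraction(favourable, len(pairs))

# With replacement: 4 * 4 = 16 ordered pairs, the same ball may appear twice.
with_replacement = prob_both_black(product(balls, repeat=2))
# Without replacement: 4 * 3 = 12 ordered pairs of different balls.
without_replacement = prob_both_black(permutations(balls, 2))

print(with_replacement)     # 9/16
print(without_replacement)  # 1/2
```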

5. Theorem of Total Probability. We shall begin with an example.
Let there be three urns. The first urn contains three black balls 
and one white ball, the second contains one black ball and three 
white balls and the third only three black balls. Suppose we random- 
ly selected one of the urns (with equal probability) and then drew 
a ball from the urn at random. What is the probability of the ball 
being black? If we drew a black ball this obviously indicates that 
we either selected the first urn and drew a black ball from it or 
did the same with the second or with the third urn. All these three 
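variants are pairwise mutually exclusive. By formula (7), the probability
of the first variant taking place is equal to $\frac{1}{3} \cdot \frac{3}{4} = \frac{1}{4}$, the probability
of the second variant is equal to $\frac{1}{3} \cdot \frac{1}{4} = \frac{1}{12}$ and the probability
of the third one is equal to $\frac{1}{3} \cdot 1 = \frac{1}{3}$. Hence, the probability of the
occurrence of one of the variants is equal, by formula (5), to

$$\frac{1}{4} + \frac{1}{12} + \frac{1}{3} = \frac{2}{3}$$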

This is just the sought-for probability.

Now let us turn to the general case. Suppose that the result of
a trial is the occurrence of one and only one of the $k$ events $B_1, B_2, \dots, B_k$
which are pairwise mutually exclusive (in the previous
example the role of such events was played by the selections of one
of the urns). Besides, let us consider an event $A$ (in the above example
the role of $A$ was played by the drawing of a black ball). We can
regard $A$ as being equivalent to the event consisting in the occurrence
of $B_1$ and $A$, or of $B_2$ and $A$, or of $B_3$ and $A$ and so on. All the
last variants being mutually exclusive, formula (5) implies

$$P\{A\} = P\{(B_1 \text{ and } A) \text{ or } (B_2 \text{ and } A) \dots \text{ or } (B_k \text{ and } A)\} = P\{B_1 \text{ and } A\} + P\{B_2 \text{ and } A\} + \dots + P\{B_k \text{ and } A\}$$


From this, by formula (6), we finally deduce 


$$P\{A\} = P\{A \mid B_1\}\,P\{B_1\} + P\{A \mid B_2\}\,P\{B_2\} + \dots + P\{A \mid B_k\}\,P\{B_k\} \qquad (9)$$


This formula is called the formula of total probability (partition 
formula). It can be applied to problems similar to the one consi- 
dered in the foregoing paragraph. 
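
Formula (9) is a sum of products and can be written as a one-line function. The sketch below applies it to the three urns (three black balls and one white ball, one black ball and three white balls, three black balls only); the function and variable names are assumptions of the example.

```python
from fractions import Fraction

def total_probability(priors, conditionals):
    # Formula (9): sum over all B_i of P{B_i} * P{A | B_i}.
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in zip(priors, conditionals))

p_urn = [Fraction(1, 3)] * 3                                  # each urn is equally probable
p_black_given_urn = [Fraction(3, 4), Fraction(1, 4), Fraction(1)]

print(total_probability(p_urn, p_black_given_urn))  # 2/3
```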

6. Formulas for the Probability of Hypotheses. We begin with 
the above example of the three urns again. Suppose that we know 
the distribution of the balls in the urns and that the urns themselves 
are indistinguishable. This means that when we select one of the 
urns at random we do not know which of the urns has been selected. 
Then, considering the three hypotheses that we have selected the 
first urn or the second urn or the third one we conclude that they 
are all equally probable, that is the probability of each of the hypotheses
is equal to $\frac{1}{3}$. Now let us draw a ball at random from


the urn we have selected and let the ball turn out to be white. Then 
we should reappraise the probabilities of the hypotheses. For in- 
stance, after the drawing of a white ball, it becomes clear that the 
urn we have selected cannot be the third one and that it is more 
probable that we have selected the second urn than the first (why?).
The probabilities calculated before the performance of the experi- 
ment (i.e. before drawing a ball) are called a priori probabilities, 
and the reappraised probabilities are called a posteriori probabilities 
(the term a priori originates from Latin and means presumptive, 
and a posteriori is the reverse of a priori). Now, how can we find 
these reappraised probabilities? 


Let us take the general case. Let there be several hypotheses
$H_1, H_2, \dots, H_k$, and let it be known that one and only one of them
holds. Let the a priori probabilities of the hypotheses be equal to
$P\{H_1\}, P\{H_2\}, \dots, P\{H_k\}$, respectively. Suppose that the
conditional probabilities $P\{A \mid H_i\}$ ($i = 1, 2, \dots, k$) of an event
$A$ relative to each of the hypotheses are known. Then the a posteriori
probabilities of the hypotheses are nothing but the probabilities
$P\{H_i \mid A\}$ ($i = 1, 2, \dots, k$). To calculate them we write, on
the basis of (6), the relations

$$P\{A\}\,P\{H_i \mid A\} = P\{H_i\}\,P\{A \mid H_i\}$$

and then, applying formula (9), deduce

$$P\{H_i \mid A\} = \frac{P\{A \mid H_i\}\,P\{H_i\}}{P\{A\}} = \frac{P\{A \mid H_i\}\,P\{H_i\}}{P\{A \mid H_1\}\,P\{H_1\} + P\{A \mid H_2\}\,P\{H_2\} + \dots + P\{A \mid H_k\}\,P\{H_k\}}, \quad i = 1, \dots, k$$


These are the sought-for formulas for the probability of hypotheses 
(Bayes’ theorem). Let the reader apply the formulas to verify that
the reappraised probabilities in the problem considered in the preceding
paragraph are equal to $\frac{1}{4}$, $\frac{3}{4}$ and 0, respectively.
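
The reappraisal of the hypotheses for the three urns can be carried out mechanically. The following sketch assumes the same composition of the urns as above and that a white ball has been drawn; the names are hypothetical.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    # Bayes' formulas: P{H_i | A} = P{A | H_i} P{H_i} / P{A},
    # where P{A} is found by the formula of total probability.
    joint = [p_h * p_a_given_h for p_h, p_a_given_h in zip(priors, likelihoods)]
    p_a = sum(joint)
    return [j / p_a for j in joint]

a_priori = [Fraction(1, 3)] * 3
p_white_given_urn = [Fraction(1, 4), Fraction(3, 4), Fraction(0)]

a_posteriori = posterior(a_priori, p_white_given_urn)
for number, p in enumerate(a_posteriori, start=1):
    print("urn", number, p)  # the a posteriori probabilities are 1/4, 3/4 and 0
```

As it must be, the a posteriori probabilities again add up to unity, and the third urn, which contains no white balls, is excluded.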


7. Disregarding Low-Probability Events. We see that the methods 
of the theory of probability enable us to calculate the probabilities 
of various events. How can we utilize these results? One can hardly 
be satisfied with the answer that a given event will either occur 
or not. 

There is an approach to the problem which is typical of applied 
mathematics. It is based on the idea that if the probability of an 
event $A$ under consideration is sufficiently small, that is if $P\{A\} < \varepsilon$
where $\varepsilon$ is a sufficiently small positive number, we can approximately
put $P\{A\} = 0$ and thus consider the event $A$ to be practically
impossible. In such a case we simply disregard the possibility
that $A$ may occur. Of course, this does not exclude the theoretical
possibility of the occurrence of $A$, and therefore the prediction that
$A$ will not occur may turn out to be wrong. But the smaller $\varepsilon$, the
rarer the occurrences of the event.

But how can we choose $\varepsilon$? There are various traditions concerning
this question in different divisions of applied mathematics. If
there is nothing dangerous in the occurrence of the event $A$, that is
if the error introduced by the incorrect prediction can be easily
corrected, we can put $\varepsilon = 0.1$. This means that in the long run
approximately 10 per cent of predictions will be false. But if a
higher reliability is not connected with essential difficulties we
usually put $\varepsilon = 0.01$. For instance, if we toss a coin 100 times the
meaning of the choice of $\varepsilon = 0.01$ is that we disregard the possibility
of such events as the coin coming up heads seven times in succession
because $2^7 \approx 100$. For still more accurate predictions we
can put $\varepsilon = 0.001$; then the average frequency of incorrect predictions
will be about one per thousand and so on. The smaller $\varepsilon$, the
more accurate the prediction. But at the same time it is more difficult
to guarantee such an accuracy when $\varepsilon$ is decreased. The accuracy
should be particularly great if an incorrect prediction may be connected
with casualties. In such cases we sometimes cannot rely
upon probabilistic inferences, and then we have to resort to deter- 
ministic ones. 

In Sec. 14 we shall discuss some methods of choosing criteria 
(i.e. choosing $\varepsilon$) according to which events can be considered to be
practically impossible. 


§ 2. Random Variables 


8. Definitions. A random variable is a variable quantity which
randomly assumes a certain numerical value resulting from the out- 
come of a trial. This value depends on chance and, generally speak- 
ing, varies as the trials are repeated. 

Examples of random variables are the number of students atten- 
ding a lecture, the length of a manufactured article taken from 
a lot, the duration of life of a person and so on. 

Like every quantity (see Sec. I.5), a random variable can be 
discrete or continuous. For instance, the first random variable in
the above examples is discrete whereas the other two are continuous. 
It is essential here that even before a trial has been made we know 
that the possible values of the number of students are integral, where- 
as it is impossible to set beforehand the possible discrete values 
of the lengths of the articles. 

To obtain a representation of a discrete random variable we can 
enumerate all its possible values and indicate the probabilities 
with which these values are assumed. This results in a table of the 
following form: 


DISCRETE RANDOM VARIABLE $\xi$

values of $\xi$ | $x_1$ | $x_2$ | $x_3$ | ...
probabilities   | $P_1$ | $P_2$ | $P_3$ | ...        (10)


Such a table can be finite or infinite (theoretically). It is apparent
that all the probabilities $P_k$ must be non-negative and their sum,
according to formula (2), must be equal to unity. In the special 
case when there is only one possible value it must be assumed with 
probability 1, that is the variable in question necessarily assumes 
this value. Thus, in such a case we have a deterministic quantity. 
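
The two requirements imposed on a table of form (10) are easily checked by a program. In the sketch below a finite table is stored as a dictionary whose keys are the values of the random variable; this representation and the function name are assumptions of the example.

```python
from fractions import Fraction

def is_valid_distribution(table):
    # All the probabilities must be non-negative and their sum must equal unity.
    probabilities = list(table.values())
    return all(p >= 0 for p in probabilities) and sum(probabilities) == 1

die = {points: Fraction(1, 6) for points in range(1, 7)}  # sum of points on a die
deterministic = {5: Fraction(1)}                          # a single value assumed with probability 1
defective = {0: Fraction(1, 2), 1: Fraction(1, 3)}        # the probabilities do not add up to unity

print(is_valid_distribution(die))            # True
print(is_valid_distribution(deterministic))  # True
print(is_valid_distribution(defective))      # False
```

Exact fractions are used because a sum of floating-point numbers such as 0.1 may differ slightly from unity even when the table is correct.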

A continuous random variable Ẹ can assume all the numerical 
values or all the values belonging to some interval (or to a system 
of intervals). But the probability that such a random variable will 
exactly take on any value x set beforehand is equal to zero. (This 
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situation is similar to that of a continuously distributed mass when 
the mass of any separate point is considered to be equal to zero.) 
But we can speak about the probability that the random variable E 
will assume a value belonging to a given interval of the z-axis. 
The probability of the value of § to fall in an infinitesimal interval 
from z to a + dz is also infinitesimal; it is directly proportional 
to dx and depends on x. Hence, this probability is equal to an expres- 
sion of the form p (z) dx where p (x) is the so-called probability 
density function (the density of probability distribution or the fre- 
quency function of the probability distribution). This function 
completely characterizes the random variable &. Apparently, we 
always have p(z) >0 (—œ <2#< oo). Formula (5) shows that 
the probability that the value assumed by the random variable § 


belongs to an interval a < £ < b is equal to fp (x) dx. Hence, 
a 


by formula (2), there must be 
j p(z)dz=1 d1) 
-%0 
The expression p (x) dx is called the element of the probability 
distribution. Henceforward we shall write integrals taken from 
co to co without indicating the limits of integration; for instance, 
formula (11) will be put down as | p (2) dr =4. This will not 
lead to any misunderstandings because we shall not deal with inde- 
finite integrals in this chapter. (By the way, the sign f is rarely 
used in mathematical applications for denoting indefinite integrals. 
It usually designates definite integrals for which the limits of inte- 
gration are implied by the corresponding physical or mathematical 
meaning of the integrals. For instance, we sometimes mean that 
designates definite integrals taken over maximal ranges 
of variation of the corresponding variables of integration.) 
Using the notion of the delta function (see Sec. XIV.25) we can also 
introduce the probability density function for a discrete random 
variable. For instance, if such a random variable is represented by 
a table of form (10) we have 


$$p(x) = p_1\,\delta(x - x_1) + p_2\,\delta(x - x_2) + p_3\,\delta(x - x_3) + \dots$$


Delta functions also enable us to consider the density of probability 
distribution for a random variable of a mixed (continuous-discrete) 
type. In the general case the representation of a random variable 
is equivalent to the construction of a non-negative measure on the
straight line (see Sec. XVI.19) such that the total measure of the
whole straight line is equal to unity. Then, such a probability
measure being given, the probability that the value of the random
variable will fall in an interval is equal to the measure of the
interval.

9. Examples of Discrete Random Variables. We now consider 
a random event A with P {A} = P. Suppose that we have made 
one trial. Then how many times can the event A occur? Evidently, 
this number is equal either to 1 or to 0. Hence, we have obtained 
a random variable which can assume only two values, namely the 
value 1 with probability P and the value 0 with probability 1 − P
[see formula (1)]. 

Now let the trials be performed several times. For definiteness, 
let there be three trials. Suppose that the event A has occurred v 
times in these trials. Then v can be regarded as a random variable 
whose possible values are 0, 1, 2 and 3. Let us compute the probabilities
of these values. If v = 0 the event A does not occur in any of
the three trials. The trials being independent, the probability that
v = 0 can be found by formula (8). This results in the probability
equal to $(1 - P)^3$. The value v = 1 can be obtained in the following
three variants: the event A occurs in the first (second or third)
trial and does not occur in the other two trials. The probability
of each variant is again found by formula (8) which yields the result
$P(1 - P)^2$. Therefore, according to formula (5), the probability
that we shall have one of the variants is equal to $3P(1 - P)^2$.
The cases v = 2 and v = 3 are treated similarly, and thus we arrive
at the following table:


values of v   | 0         | 1           | 2           | 3
probabilities | $(1-P)^3$ | $3P(1-P)^2$ | $3P^2(1-P)$ | $P^3$


(Let the reader check up that the sum of the probabilities thus ob- 
tained is equal to unity!) 

The general case of n trials is investigated in like manner. Let 
again v be the number of the occurrences of the event A. Then v 
is a random variable for which we obtain the table 


values of v   | 0         | 1                           | 2                             | ... | k                             | ... | n
probabilities | $(1-P)^n$ | $\binom{n}{1}P(1-P)^{n-1}$ | $\binom{n}{2}P^2(1-P)^{n-2}$ | ... | $\binom{n}{k}P^k(1-P)^{n-k}$ | ... | $P^n$

Here $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ is a binomial coefficient which is
equal to the number of possible cases in which the event A occurs
exactly k times, i.e. it is equal to the number of combinations
of k elements from n. The set of probabilities collected in the above
table is called the law of binomial probability distribution (or, briefly,
the binomial distribution).

Example. A coin is tossed six times. What is the probability 
that it will come up heads exactly three times? 


Answer: $\binom{6}{3}\left(\frac{1}{2}\right)^3\left(1-\frac{1}{2}\right)^3 = \frac{5}{16}$.
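
The binomial table is easy to tabulate by machine. The following is a minimal sketch in Python, assuming exact rational arithmetic is wanted; the function name `binomial_table` and the value P = 1/4 are illustrative choices, not part of the text.

```python
from fractions import Fraction
from math import comb

def binomial_table(n, p):
    """Probabilities of v = 0, 1, ..., n occurrences in n independent trials."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Three trials with P = 1/4 (an arbitrary illustrative value).
three_trials = binomial_table(3, Fraction(1, 4))
total = sum(three_trials)          # the probabilities must add up to unity

# The coin example: six tosses, exactly three heads.
coin = binomial_table(6, Fraction(1, 2))[3]
print(three_trials, total, coin)
```

Working with `Fraction` keeps the check that the probabilities sum to unity exact rather than approximate.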

Now let us investigate the behaviour of the binomial distribution
when n, the number of trials, is very large whereas the probability
of the event A is very small so that there is a relation Pn = α
where α is a constant. For this purpose we pass to the limit in the
formula

$$P\{v = k\} = \binom{n}{k} P^k (1-P)^{n-k} = \frac{n(n-1)\cdots(n-k+1)}{k!}\left(\frac{\alpha}{n}\right)^k\left(1-\frac{\alpha}{n}\right)^{n-k}$$

(where P{v = k} designates the probability that v = k) as n → ∞.
This results in the limiting formula

$$P\{v = k\} = \frac{\alpha^k}{k!}\,e^{-\alpha} \qquad (k = 0, 1, 2, \dots)$$


The calculations connected with the deduction of the formula are 
left to the reader. Thus, we arrive at a random variable which can 
assume infinitely many different values. The probability distribu- 
tion thus obtained is illustrated by the following table: 


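values of v   | 0             | 1                    | 2                                 | ... | k                                 | ...
probabilities | $e^{-\alpha}$ | $\alpha e^{-\alpha}$ | $\frac{\alpha^2}{2!}e^{-\alpha}$ | ... | $\frac{\alpha^k}{k!}e^{-\alpha}$ | ...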


This is the so-called Poisson law (Poisson distribution) named after 
S. Poisson (1781-1840), a French mechanician, physicist and mathe- 
matician. 
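
The passage to the limit can also be watched numerically. The sketch below is only an illustration under assumed values: the constant Pn is called `alpha` and set to 2, and the comparison is restricted to k ≤ 10; neither choice comes from the text.

```python
from math import comb, exp, factorial

def binomial_prob(n, k, p):
    """P{v = k} for the binomial law with n trials and probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_prob(k, alpha):
    """P{v = k} for the Poisson law with parameter alpha."""
    return alpha**k / factorial(k) * exp(-alpha)

alpha = 2.0   # assumed value of the constant Pn

def largest_gap(n, kmax=10):
    """Largest difference between the two laws over k = 0..kmax when P = alpha/n."""
    return max(abs(binomial_prob(n, k, alpha / n) - poisson_prob(k, alpha))
               for k in range(kmax + 1))

gaps = [largest_gap(n) for n in (10, 100, 1000, 10000)]
print(gaps)
```

The gaps shrink roughly in proportion to 1/n, which is what the limiting formula leads one to expect.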

An example of a random variable distributed according to the 
Poisson law is the number of atoms of a certain mass of some slowly 
disintegrating radioactive substance which decay during a suffi- 
ciently long time interval so chosen that it should be possible to 
observe the disintegrations of separate atoms. Then the Poisson law 
is in fact applicable because under these conditions the decay of
an atom is independent of the disintegrations of other atoms and
all the atoms disintegrate with equal probability. There is a number 
of other similar examples. 

10. Examples of Continuous Random Variables. One of the simplest
examples is a random variable which is uniformly distributed
over an interval a < x < b, that is, which can assume all the values
belonging to the interval with equal probability and does
not assume the values lying outside the interval. The probability
density of such a variable is put down in the form

$$p(x) = \begin{cases} c & \text{for } a < x < b \\ 0 & \text{for } x < a \text{ and for } x > b \end{cases}$$

Condition (11) implies that $c = \frac{1}{b-a}$. The graph of this function


is shown in Fig. 343. A uniformly distributed random variable is
sometimes said to have a rectangular distribution (Fig. 343 illustrates
the origin of the name). For instance, the round-off error which
results from rounding the numerical value of a quantity to its nearest
integer is a uniformly distributed random variable, and we have
a = −0.5, b = 0.5 and c = 1 in this case (why is it so?).

Fig. 343

The most widely spread random variables are distributed according
to the so-called normal law (Gaussian law). The density function
of such a random variable is expressed by the formula


$$p(x) = M e^{-\beta (x-a)^2} = M \exp\left[-\beta (x-a)^2\right]$$

where a, M > 0 and β > 0 are some numerical parameters. The
parameter M can be easily expressed in terms of β. To achieve this
we must take formula (11) and substitute $s = \sqrt{\beta}\,(x - a)$ in it.
Then, using integral (XIV.72), we deduce $M = \sqrt{\beta/\pi}$ (let the reader
verify the calculations!). For our further aims (in Sec. 15) it
will be convenient to introduce the notation $\beta = \frac{1}{2\sigma^2}$ and to put
down the expression of p(x) in the form:

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(x-a)^2}{2\sigma^2}\right] \qquad (12)$$

The graph of density function (12) is shown in Fig. 344. In § 4 we
shall discuss why the Gaussian law is so widely applied.

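The probability that a random variable ξ distributed in accord
with law (12) falls in an interval α < x < β is equal to

$$\frac{1}{\sqrt{2\pi}\,\sigma}\int_{\alpha}^{\beta}\exp\left[-\frac{(x-a)^2}{2\sigma^2}\right]dx \qquad (13)$$

Fig. 344

The above probability can be
easily expressed by means of the probability integral

$$\Phi(x) = \frac{2}{\sqrt{2\pi}}\int_0^x \exp\left(-\frac{s^2}{2}\right)ds$$

for which there are extensive tables (for instance, see [23], [44] and
[48]). Indeed, substituting $\frac{x-a}{\sigma} = s$ into (13) we obtain the
expression

$$P\{\alpha < \xi < \beta\} = \frac{1}{\sqrt{2\pi}}\int_{(\alpha-a)/\sigma}^{(\beta-a)/\sigma}\exp\left(-\frac{s^2}{2}\right)ds = \frac{1}{2}\left[\Phi\left(\frac{\beta-a}{\sigma}\right)-\Phi\left(\frac{\alpha-a}{\sigma}\right)\right] \qquad (14)$$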
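
Instead of tables one can nowadays evaluate the probability integral directly. The sketch below assumes the normalization $\Phi(x) = \frac{2}{\sqrt{2\pi}}\int_0^x \exp(-s^2/2)\,ds$, which coincides with the standard error function taken at $x/\sqrt{2}$; the function names and the numerical values of the mean and of σ are illustrative assumptions.

```python
from math import erf, sqrt

def probability_integral(x):
    """Phi(x) = 2/sqrt(2*pi) * integral from 0 to x of exp(-s^2/2) ds = erf(x/sqrt(2))."""
    return erf(x / sqrt(2.0))

def normal_interval_probability(lower, upper, mean, sigma):
    """Probability that a normal variable with the given mean and sigma falls in (lower, upper)."""
    return 0.5 * (probability_integral((upper - mean) / sigma)
                  - probability_integral((lower - mean) / sigma))

# Probability of falling within one and within three standard deviations of the mean.
one_sigma = normal_interval_probability(-1.0, 1.0, mean=0.0, sigma=1.0)
three_sigma = normal_interval_probability(0.5, 3.5, mean=2.0, sigma=0.5)
print(one_sigma, three_sigma)
```

The two results do not depend on the particular mean and σ, only on the number of standard deviations, as the substitution used in the text suggests.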

11. Joint Distribution of Several Random Variables. We shall
confine ourselves to the case of continuous random variables. Moreover,
for simplicity's sake, we shall consider a system of two variables.
Discrete variables and systems of more than two variables are
investigated in a similar way.

Let us simultaneously consider two random variables ξ and η
which take on certain numerical values in one and the same trial.
Then the probability of ξ falling in an interval between x and x + dx
and η falling between y and y + dy should be proportional both
to dx and to dy. Hence, this probability is equal to an expression
of the form p(x, y) dx dy. The function p(x, y) is referred to as the
probability density (frequency function) of the joint (simultaneous)
distribution of the random variables ξ and η. This function completely
characterizes the pair of random variables ξ, η. Obviously,
the function must satisfy the conditions

$$p(x, y) \ge 0 \quad\text{and}\quad \int dx \int p(x, y)\,dy = 1$$


If the probability density of the joint distribution of two random
variables ξ and η is known we can easily find the probability density
of each of the variables ξ and η (the densities of the so-called
marginal distributions of ξ and η). Actually, formula (5) implies
that the probability that ξ will assume a value lying between x
and x + dx when η can have an arbitrary value is equal to

$$P\{x < \xi < x + dx\} = \int_{y=-\infty}^{\infty} p(x, y)\,dx\,dy = \left(\int p(x, y)\,dy\right)dx$$

It follows, by Sec. 9, that the density of the probability distribution
of the variable ξ is a function $p_\xi(x)$ of the form

$$p_\xi(x) = \int p(x, y)\,dy$$

We similarly deduce the expression

$$p_\eta(y) = \int p(x, y)\,dx$$

for the probability density of η. But the converse transition from
$p_\xi(x)$ and $p_\eta(y)$ to p(x, y) may be impossible in the general case,
that is, it may be impossible to restore p(x, y) knowing only $p_\xi(x)$
and $p_\eta(y)$, because here an essential role is played by the "interaction"
between the variables ξ and η.

There is an important special case when p(x, y) can be obtained
on the basis of $p_\xi(x)$ and $p_\eta(y)$. This is the case when the random
variables ξ and η are independent, i.e. when any information concerning
one of them does not affect the probability of a numerical
value assumed by the other. In this case formula (7) implies that

$$p(x, y)\,dx\,dy = P\{x < \xi < x + dx,\ y < \eta < y + dy\} = P\{x < \xi < x + dx\}\,P\{y < \eta < y + dy\} = p_\xi(x)\,dx\;p_\eta(y)\,dy$$

that is,

$$p(x, y) = p_\xi(x)\,p_\eta(y) \qquad (15)$$

Conversely, if condition (15) holds we can show that the random
variables ξ and η are independent.
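
As a rough numerical illustration of marginal densities and of condition (15), the sketch below integrates two joint densities over the unit square by the midpoint rule. Both densities, x + y and 4xy, are hypothetical examples chosen for this sketch; they do not come from the text.

```python
def integrate(g, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of g over (lo, hi)."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def marginal_xi(p, x):
    """Density of xi: integrate the joint density over y (here y runs over (0, 1))."""
    return integrate(lambda y: p(x, y), 0.0, 1.0)

def marginal_eta(p, y):
    """Density of eta: integrate the joint density over x (here x runs over (0, 1))."""
    return integrate(lambda x: p(x, y), 0.0, 1.0)

def factorizes(p, points, tol=1e-6):
    """Check condition (15) at the given sample points."""
    return all(abs(p(x, y) - marginal_xi(p, x) * marginal_eta(p, y)) < tol
               for x, y in points)

dependent = lambda x, y: x + y          # hypothetical joint density on the unit square
independent = lambda x, y: 4 * x * y    # hypothetical joint density on the unit square

points = [(0.2, 0.3), (0.5, 0.9), (0.8, 0.1)]
print(factorizes(dependent, points), factorizes(independent, points))
```

For the first density the marginals are x + 1/2 and y + 1/2, whose product differs from x + y, so the joint density cannot be restored from the marginals alone.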

There is a more general notion of a multidimensional random 
variable related to systems of random variables. Such a variable 
takes on the values which are the elements of a multidimensional
space (R) (see Sec. X.2). The law of probability distribution of
a random variable of this type is represented by a non-negative 
measure defined in (R) (see Sec. XVI.19), the measure of the whole 
space (R) being equal to unity. The measure of a region belonging 
to the space is nothing but the probability that the variable in 
question falls in the region. If this measure in (R) is such that it 
is possible to perform differentiation with respect to it (for instance, 
such as the Lebesgue measure in a finite-dimensional Euclidean space 
mentioned in Sec. XVI.19) then, differentiating, we can pass to the 
probability density (see Sec. XVI.7). If we introduce generalized
coordinates $t_1, t_2, \dots, t_n$ in (R), we can consider the set of the coordinates
of the element of the space (R) which represents the multidimensional
random quantity in question instead of the quantity
itself. Thus we come to a system of several random variables having
a probability density of their joint distribution. 

As a simple example illustrating what has just been said, we
consider the n-dimensional normal (Gaussian) law. This is the law
of probability distribution of a random vector ξ in the space $E_n$
(see Sec. VII.18) whose probability density is of the form

p(x) = M exp(−x*Ax) (16)

where A is a positive-definite symmetric matrix (see Secs. XII.44
and XII.7) and M is a normalization factor so chosen that the integral
of p(x) taken over the whole space should be equal to unity.
As is known from Sec. XI.11, the quadratic form x*Ax can be reduced
to a diagonal form by means of introducing a new Cartesian
basis in $E_n$. Hence, after the basis has been introduced, frequency
function (16) is transformed to the form

$$p(x') = M \exp\left(-\lambda_1 {x'_1}^2 - \lambda_2 {x'_2}^2 - \dots - \lambda_n {x'_n}^2\right) = M \exp\left(-\lambda_1 {x'_1}^2\right)\exp\left(-\lambda_2 {x'_2}^2\right)\cdots\exp\left(-\lambda_n {x'_n}^2\right)$$

where $\lambda_1, \lambda_2, \dots, \lambda_n$ are the eigenvalues of the matrix A. This enables
us to easily find $M = \sqrt{\lambda_1 \lambda_2 \cdots \lambda_n}\;\pi^{-n/2}$. Besides, formula (15)
indicates that the coordinates of the random vector with respect
to the new basis are independent random variables.

12. Functions of Random Variables. If η = f(ξ) where ξ is a
random variable, η is also a random variable. Besides, if ξ is discrete
(continuous), η is also discrete (continuous). If ξ is represented
by a table of form (10) then, generally speaking, η will assume the
value $f(x_1)$ with the probability $p_1$, the value $f(x_2)$ with the probability
$p_2$ and so on. But at the same time we must take into account
the fact that if $f(x_i) = f(x_j)$ (where i ≠ j) then, of course, the corresponding
probabilities $p_i$ and $p_j$ are added together. For instance,
if ξ takes on the values −2, −1, 0, 1 and 2 with the same probability
$\frac{1}{5}$, it follows that ξ² takes on the values 0, 1 and 4 with the
corresponding probabilities $\frac{1}{5}$, $\frac{2}{5}$ and $\frac{2}{5}$ (why?).
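
The rule of adding together the probabilities of coinciding values can be sketched in a few lines. The representation of a table of form (10) as a Python dictionary and the name `distribution_of_function` are assumptions made for this illustration.

```python
from collections import defaultdict
from fractions import Fraction

def distribution_of_function(table, f):
    """Given {x_i: p_i} for xi, return the table {f(x_i): probability} for f(xi)."""
    result = defaultdict(Fraction)
    for x, p in table.items():
        result[f(x)] += p      # coinciding values f(x_i) = f(x_j) pool their probabilities
    return dict(result)

# xi takes the values -2, -1, 0, 1, 2 with the same probability 1/5.
xi = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}
xi_squared = distribution_of_function(xi, lambda x: x * x)
print(xi_squared)
```

The values 1 and 4 each arise from two values of ξ, which is why their probabilities are doubled.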

If η = f(ξ) and ξ is a continuous random variable with the
probability density $p_\xi(x)$ then, in the case when η = f(ξ) is an increasing
function of ξ, its probability density $p_\eta(y)$ is expressed by the formula

$$p_\eta(y) = \frac{1}{dy}P\{y < \eta < y + dy\} = \frac{1}{dy}P\{x < \xi < x + dx\} = \frac{p_\xi(x)\,dx}{dy} = \frac{p_\xi(x)}{f'(x)}$$

where x entering into the right-hand side is found from the equation
f(x) = y. If the function f(ξ) is a decreasing one, |f′(x)| should
be substituted for f′(x) into the right-hand side. Finally, if the
function f(ξ) is a non-monotone one, the right-hand side should
be replaced by the sum of analogous expressions, the summation
being extended over all the solutions of the equation f(x) = y.
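
The formula for an increasing function can be checked numerically. In the sketch below ξ is assumed to be uniformly distributed over (0, 1) and η = exp(ξ); both choices, and all function names, are assumptions made only for illustration. The formula then gives the density 1/y on the interval 1 < y < e.

```python
from math import e, exp, log

def uniform_density(x):
    """Density of a variable uniformly distributed over (0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def density_of_increasing_function(p_xi, f_inverse, f_prime):
    """Density of eta = f(xi) for increasing f: p_eta(y) = p_xi(x) / f'(x) with f(x) = y."""
    def p_eta(y):
        x = f_inverse(y)
        return p_xi(x) / f_prime(x)
    return p_eta

def integrate(g, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of g over (lo, hi)."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

# eta = exp(xi): the inverse function is log and the derivative is exp.
p_eta = density_of_increasing_function(uniform_density, log, exp)

total = integrate(p_eta, 1.0, e)               # should be close to 1
below = integrate(p_eta, 1.0, exp(0.5))        # P{eta < exp(0.5)} = P{xi < 0.5}
print(total, below)
```

The second integral reproduces P{ξ < 0.5} = 0.5, as it must, since the events η < exp(0.5) and ξ < 0.5 coincide.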

We can similarly investigate functions of several random variables.
For example, let us take a function of the form ζ = f(ξ, η)
where the pair of random variables ξ, η is characterized by the density
of their joint probability distribution p(x, y). Then

$$p_\zeta(z) = \frac{1}{dz}P\{z < \zeta < z + dz\} = \frac{d}{dz}\iint\limits_{f(x,\,y) < z} p(x, y)\,dx\,dy \qquad (17)$$

In the general case the derivative entering into the right-hand side
of (17) is computed according to the rules given in Sec. XVI.18.

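We shall illustrate the above result by applying it to calculating
the probability density $p_\zeta(z)$ of the sum ζ = ξ + η of two independent
random variables ξ and η. By formulas (15) and (17), we
obtain

$$p_\zeta(z) = \frac{d}{dz}\iint\limits_{x+y<z} p_\xi(x)\,p_\eta(y)\,dx\,dy = \frac{d}{dz}\int dx\int_{-\infty}^{z-x} p_\xi(x)\,p_\eta(y)\,dy = \frac{d}{dz}\int\left[\int_{-\infty}^{z-x} p_\eta(y)\,dy\right]p_\xi(x)\,dx =$$

$$= \int\left[\frac{d}{dz}\int_{-\infty}^{z-x} p_\eta(y)\,dy\right]p_\xi(x)\,dx = \int p_\xi(x)\,p_\eta(z-x)\,dx$$

(let the reader verify the calculations!).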
As an exercise, we suggest that the reader should prove that the
sum of two independent random variables each of which is uniformly
distributed over the same interval 0 < x < 1 has the probability
density of the form

$$p(z) = \begin{cases} 0 & \text{for } z < 0 \text{ and } z > 2 \\ z & \text{for } 0 < z < 1 \\ 2 - z & \text{for } 1 < z < 2 \end{cases}$$
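
The exercise can be checked numerically before it is proved. The sketch below approximates the integral of $p_\xi(x)\,p_\eta(z - x)$ by the midpoint rule and compares it with the piecewise formula; the integration limits, the step count and the function names are assumptions of this illustration.

```python
def uniform_density(x):
    """Density of a variable uniformly distributed over (0, 1)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def density_of_sum(p_xi, p_eta, z, lo=-1.0, hi=3.0, n=40000):
    """Midpoint-rule approximation of the integral of p_xi(x) * p_eta(z - x) dx."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += p_xi(x) * p_eta(z - x)
    return h * total

def triangular_density(z):
    """The density stated in the exercise."""
    if 0.0 < z < 1.0:
        return z
    if 1.0 <= z < 2.0:
        return 2.0 - z
    return 0.0

sample_points = (-0.5, 0.5, 1.0, 1.5, 2.5)
errors = [abs(density_of_sum(uniform_density, uniform_density, z) - triangular_density(z))
          for z in sample_points]
print(errors)
```

Because the integrand is discontinuous, the midpoint rule is accurate here only to about the step size, which is sufficient for a plausibility check.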


§ 3. Numerical Characteristics of Random Variables 


13. The Mean Value. Let there be a discrete random variable ξ
represented by a table of form (10). Suppose that a great number N
of trials have been performed. What will be the arithmetic mean
of the values of ξ thus obtained? To answer the question we denote
by $N_i$ the number of the outcomes of the trials in which ξ has assumed
the value $x_i$. Then the sought-for arithmetic mean is equal to

$$\frac{N_1 x_1 + N_2 x_2 + \dots}{N} = x_1\frac{N_1}{N} + x_2\frac{N_2}{N} + \dots$$

But as we know from Sec. 2, we have $\frac{N_i}{N} \to p_i$ when N → ∞. Hence,
in the limit, we obtain the expression

$$x_1 p_1 + x_2 p_2 + x_3 p_3 + \dots \qquad (18)$$


It is referred to as the mean value (mathematical expectation or,
briefly, expectation or centre of distribution) of the random variable
ξ. This is one of the most important characteristics of ξ. The mean
of ξ is usually designated as $\bar{\xi}$ or M{ξ} or Mξ. It should be noted
that the mean value of a random variable is no longer a random
variable but is a deterministic quantity. (For instance, verify that
the mean value of the number of points obtained in throwing a die
is the constant number 3.5.)

Formula (18) can be obviously generalized for the case when ξ
is a continuous random variable with the probability density p(x):

$$\bar{\xi} = \sum x\,dP = \sum x\,p(x)\,dx = \int x\,p(x)\,dx \qquad (19)$$

[We have put down the summation sign to stress the analogy between
formulas (18) and (19); of course there must be the sign of
integration here which has been written in the last expression entering
into (19).] Let the reader verify, by means of the formula,
that the means of the variables considered in Sec. 10 are, respectively,
$\frac{a+b}{2}$ and a. But these results are obviously implied by the
symmetry of the distributions.

14. Properties of the Mean Value.


1. The definition implies that the mean value $\bar{\xi}$ of a random variable
ξ has the same dimension as ξ and that it lies between the greatest
and the least possible values of ξ.

2. If we multiply a random variable by a constant (i.e. by a constant
deterministic quantity) its mean value will be multiplied by
the same constant: M{Cξ} = C M{ξ} (C = const). This follows
from Sec. 13 because the multiplication of all the values by a constant
yields the multiplication of the arithmetic mean value by the
same constant. The next property is proved in a similar way.

3. The mean of the sum of two random variables equals the sum
of their means: M{ξ + η} = M{ξ} + M{η}. In particular, if
a constant is added to a random variable, the same constant is added
to its mean value.

By the way, applying the last property we can readily find the
mean value of a random variable ξ distributed according to the
binomial law (see Sec. 9). Let us consider independent random variables
$\xi_1, \xi_2, \dots, \xi_n$ which take on the value 1 with the probability
P and the value 0 with the probability 1 − P. Clearly,
we can interpret $\xi_i$ as a variable indicating the number of the occurrences
of an event A in the ith trial, the probability of A being P.
Then the variable in question can be represented in the form

$$\xi = \xi_1 + \xi_2 + \dots + \xi_n \qquad (20)$$

(see Sec. 9 where an analogous variable was denoted by v). From
(20) we obtain $M\{\xi\} = M\{\xi_1\} + M\{\xi_2\} + \dots + M\{\xi_n\} = nP$.
This result is directly implied by the meaning of the variable, and
we could have guessed it without applying the above calculations.
Since the product nP = α is kept fixed in the passage to the limit, it
follows that for a Poisson distribution (Sec. 9) we have M{ξ} = α.


4. The mean of the product of two independent random variables
is equal to the product of their mean values:

M{ξη} = M{ξ} M{η} if ξ and η are independent

Indeed, if ξ takes the values $x_i$ with the probabilities $p_i$ and η takes
the values $y_j$ with the probabilities $q_j$ then ξη assumes the values
$x_i y_j$ with the probabilities $p_i q_j$ since ξ and η are independent (see
Sec. 4). Therefore

$$M\{\xi\eta\} = \sum_{i,\,j} x_i y_j\,(p_i q_j) = \sum_i\left(\sum_j x_i y_j\,p_i q_j\right) = \sum_i x_i p_i \sum_j y_j q_j = M\{\xi\}\,M\{\eta\}$$


Properties 3 and 4 are immediately extended to an arbitrary number
of summands and factors. It should be noted that the condition
that the factors should be independent is essential for property 4.
If the condition does not hold the property no longer remains true
in the general case. For example, if we square a random variable,
i.e. multiply it by itself, then, as a rule, the mean value of the square
does not equal the square of the mean: $\overline{\xi^2} \neq (\bar{\xi})^2$. For instance, in the
example considered at the beginning of Sec. 12 we have M{ξ} = 0
but M{ξ²} = 2 (check it up!).
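
Properties 3 and 4 and the remark about squares can be verified on small tables. The sketch below is an illustration only: tables of form (10) are represented as dictionaries, and the helper names are assumptions of this sketch.

```python
from fractions import Fraction
from itertools import product

def mean(table):
    """Expression (18): the sum of x_i * p_i over the table {x_i: p_i}."""
    return sum(x * p for x, p in table.items())

def mean_of_product(t1, t2):
    """Mean of the product of two INDEPENDENT variables: joint probabilities multiply."""
    return sum(x * y * p * q for (x, p), (y, q) in product(t1.items(), t2.items()))

def mean_of_sum(t1, t2):
    """Mean of the sum of two independent variables, computed from the joint table."""
    return sum((x + y) * p * q for (x, p), (y, q) in product(t1.items(), t2.items()))

die = {k: Fraction(1, 6) for k in range(1, 7)}
xi = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}   # the example from Sec. 12

mean_die = mean(die)                                   # number of points on one die
two_dice_product = mean_of_product(die, die)           # two independent dice
two_dice_sum = mean_of_sum(die, die)
mean_square = sum(x * x * p for x, p in xi.items())    # xi multiplied by itself
print(mean_die, two_dice_product, two_dice_sum, mean(xi), mean_square)
```

The last two numbers show the failure of property 4 for dependent factors: the mean of ξ is zero while the mean of ξ² is not.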

5. If a random variable ξ assumes the values which are placed
symmetrically with respect to a constant a with equal probabilities,
then $\bar{\xi} = a$ (this is obvious).

6. If a random variable ξ is represented by a table of form (10)
it follows that $\overline{f(\xi)} = \sum_i f(x_i)\,p_i$. If a continuous random variable
ξ has the probability density p(x) we have $\overline{f(\xi)} = \int f(x)\,p(x)\,dx$
(this immediately follows from the definitions).

In particular, the calculation of the mean of a random variable
enables us to set a criterion according to which a random event
can be considered to be practically impossible (that is, to set
a certain value of the quantity ε mentioned in Sec. 7). Here
we shall give only some simple considerations concerning this
question. Suppose we have agreed that the random events whose
probabilities are less than a certain value ε are disregarded, i.e.
they are considered to be practically impossible. But there can
be an incorrect prediction, and this means that an event which
is regarded as impossible may nevertheless occur. Let the loss
connected with the incorrect prediction be equal to an amount
k expressed in certain monetary units. Then the average (mean)
loss will be equal to εk. It is therefore desirable to decrease ε,
but at the same time the perfection of the predictions also involves
some additional expenditure. Let us designate the cost of a trial
which can guarantee a prediction "accurate to ε" as f(ε). In many
concrete problems such a function can be approximately found. An
example of the graph of a function of this type is represented in
Fig. 345. Hence, the average total expenditure (the cost of the trial
plus the mean loss connected with incorrect predictions) equals
f(ε) + kε, and thus the value ε = ε₀ which is to be set as a criterion
must be chosen so that this sum should be minimized.

Fig. 345

In conclusion we shall give several remarks concerning multidimensional
random quantities (see Sec. 11) which assume their values
in a finite-dimensional linear space (R) (see Sec. VII.17). Thus, we
shall speak about finite-dimensional random vectors. The formula
of the mean value of such a random variable is similar to (19):

$$\bar{\xi} = \int\limits_{(R)} x\,dP = \int\limits_{(R)} x\,p(x)\,dR$$

where the integration is extended over the whole space (R) and dR
is the differential of volume (the element of volume) in (R). All
the properties of the mean in this case are analogous to those of the
scalar (one-dimensional) case. Besides, property 4 holds for all
kinds of products in which it is permissible to remove brackets according
to the ordinary arithmetical rules (e.g. for the product of
a vector by a scalar, for the scalar or vector product of vectors and so
on).
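
Returning to the criterion of practical impossibility discussed earlier in this section, the choice of ε₀ can be sketched numerically. Everything specific below is hypothetical: the cost function f(ε) = c/ε, the values of c and k, and the use of a ternary search are assumptions made only to illustrate minimizing f(ε) + kε.

```python
from math import sqrt

def total_expenditure(eps, k, cost):
    """Cost of the trial plus the mean loss: f(eps) + k * eps."""
    return cost(eps) + k * eps

def minimize(g, lo, hi, iterations=200):
    """Ternary search for the minimum of a function with a single minimum on (lo, hi)."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

c, k = 0.01, 1.0e4                 # hypothetical cost scale and loss, in monetary units
cost = lambda eps: c / eps         # hypothetical cost of a prediction "accurate to eps"

eps0 = minimize(lambda eps: total_expenditure(eps, k, cost), 1e-8, 1.0)
print(eps0, sqrt(c / k))
```

For this particular cost function the minimum can also be found by differentiation, which gives ε₀ = √(c/k); the search reproduces that value.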

15. Variance. The variance characterizes the degree of the spread
of a random variable about its mean (expectation). Let us be given
a random variable ξ. By definition, its variance (also called dispersion)
is the quantity

$$D\xi = D\{\xi\} = M\{(\xi - M\xi)^2\} \qquad (21)$$

This quantity is deterministic and always positive except the case
when ξ itself is a deterministic quantity (in this case we have Dξ = 0).

By property 6 in Sec. 14, formula (21) implies the formulas

$$D\xi = \sum_i (x_i - \bar{\xi})^2\,p_i \quad\text{and}\quad D\xi = \int (x - \bar{\xi})^2\,p(x)\,dx$$


From (21), we easily see that if ξ is multiplied by a constant C
then Dξ is multiplied by C² and that if a constant is added to ξ
its variance Dξ does not change. Further, if two random variables
ξ and η are independent, we have

$$D\{\xi + \eta\} = D\xi + D\eta \qquad (22)$$

In fact,

$$D\{\xi + \eta\} = M\{[\xi + \eta - M(\xi + \eta)]^2\} = M\{[(\xi - M\xi) + (\eta - M\eta)]^2\} =$$
$$= M\{(\xi - M\xi)^2\} + 2M\{(\xi - M\xi)(\eta - M\eta)\} + M\{(\eta - M\eta)^2\} =$$
$$= D\xi + 2M\{\xi - M\xi\}\cdot M\{\eta - M\eta\} + D\eta = D\xi + 2\cdot 0\cdot 0 + D\eta = D\xi + D\eta$$


(where has the independence of the random variables ξ and η been
used in the above calculations?).

Let us determine the dispersions for the examples considered in
Secs. 9 and 10. The variance of each summand entering into formula
(20) is equal to $(0 - P)^2 (1 - P) + (1 - P)^2 P = P(1 - P)$.
Hence, by formula (22), we obtain the expression Dξ = nP(1 − P)
for the binomial law. Now passing to the limit, as n → ∞, we obtain
the expression Dξ = α for the Poisson distribution. For the
uniform distribution over an interval a < x < b we deduce

$$D\xi = \int\left(x - \frac{a+b}{2}\right)^2 p(x)\,dx = \int_a^b\left(x - \frac{a+b}{2}\right)^2\frac{1}{b-a}\,dx = \frac{(b-a)^2}{12}$$

(check up the result!). Finally, for the normal law we obtain

$$D\xi = \frac{1}{\sqrt{2\pi}\,\sigma}\int (x - a)^2\exp\left[-\frac{(x-a)^2}{2\sigma^2}\right]dx$$

Substituting $s = \frac{x-a}{\sqrt{2}\,\sigma}$ into the integral we get

$$D\xi = \frac{2\sigma^2}{\sqrt{\pi}}\int s^2\exp(-s^2)\,ds$$

Now, putting u = s and dv = s exp(−s²) ds [that is du = ds and
$v = -\frac{1}{2}\exp(-s^2)$] we integrate by parts and thus deduce the result

$$D\xi = \frac{\sigma^2}{\sqrt{\pi}}\int\exp(-s^2)\,ds = \sigma^2$$


Together with the variance Dξ, we often use the square root of
it which is called the standard deviation of the random variable ξ.
The standard deviation $\sqrt{D\xi}$ is of the same dimension as ξ. We
see that the parameter σ entering into Gaussian law (12) is nothing
but the standard deviation of the normally distributed random
variable ξ.

Formula (22) implies an important consequence. Let random variables
$\xi_1, \xi_2, \dots, \xi_n$ be independent and let them be distributed
according to the same law with the standard deviation σ. Then their
sum has the dispersion nσ², and therefore its standard deviation is
equal to $\sqrt{n}\,\sigma$. Now notice that the above two examples indicate
that the values of a random variable which correspond to the higher
probability are concentrated on an interval whose length is directly
proportional to the standard deviation (this property will be discussed
in more detail in Sec. 19). Hence, for the sum of independent
random summands, the length of such an interval is proportional
to $\sqrt{n}$ (but not to n as it would be if we had n equal summands).
In particular, this law holds for the error of the sum of several summands
which are known with the same accuracy (compare with
Sec. I.9).

Two different random variables distributed according to different
laws may have the same expectations and the same variances. Therefore,
to obtain a more complete description of random variables,
we also use some other numerical characteristics. In particular, we
introduce the so-called moments of a random variable (of its probability
distribution) which are defined as

$$M\{\xi^k\} = \int x^k\,p(x)\,dx \qquad (k = 1, 2, 3, \dots)$$

under the assumption that the corresponding integrals are convergent.
This is the kth moment (the moment of order k) of the variable
ξ. The first moment is nothing but the expectation (mean value),
and the variance, as it is implied by formula (21), is expressed in
terms of the moments of the first and second orders:

$$D\xi = M\{\xi^2\} - 2M\{\xi\,M\xi\} + M\{(M\xi)^2\} = M\{\xi^2\} - (M\xi)^2$$

The higher-order moments characterize the law of probability distribution
of a random variable more completely than the expectation
and the variance.
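
Formula (22), the √n law for the standard deviation, and the expression of the variance through moments can all be checked exactly on a discrete example. The sketch below uses the discrete analogue of the moment integral (a sum over a table); the choice of four dice and the helper names are assumptions of this illustration.

```python
from fractions import Fraction
from math import isclose, sqrt

def moment(table, k):
    """Discrete analogue of the k-th moment: the sum of x_i**k * p_i."""
    return sum(x**k * p for x, p in table.items())

def variance(table):
    """Formula (21) written out for a table {x_i: p_i}."""
    m = moment(table, 1)
    return sum((x - m) ** 2 * p for x, p in table.items())

def sum_distribution(t1, t2):
    """Table of the sum of two independent discrete variables."""
    out = {}
    for x, p in t1.items():
        for y, q in t2.items():
            out[x + y] = out.get(x + y, 0) + p * q
    return out

die = {k: Fraction(1, 6) for k in range(1, 7)}

# Sum of n = 4 independent dice, built by repeated summation.
n = 4
total = die
for _ in range(n - 1):
    total = sum_distribution(total, die)

print(variance(die), variance(total), sqrt(variance(total)), sqrt(n) * sqrt(variance(die)))
```

The variance of the sum is exactly n times the variance of one die, so the standard deviation grows only as √n.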

16. Correlation. Let us be given the probability density p(x, y)
of a joint distribution of two random variables ξ and η (see Sec. 11).
If it is known that the variable ξ has assumed the value ξ = a we
can speak about the conditional probability distribution of the random
variable η relative to the hypothesis that ξ takes the value a.
Let us denote the corresponding conditional density function of η
relative to the hypothesis ξ = a as $p_\eta(y \mid \xi = a)$*. Then the conditional
probability of η falling in an interval between y and y + dy
provided ξ takes the value ξ = a is equal to $p_\eta(y \mid \xi = a)\,dy$, and

$$\int p_\eta(y \mid \xi = a)\,dy = 1$$
* Translator's note. As it was shown in Sec. 11, knowing the density p(x, y)
of the joint distribution we can find the densities of the marginal distributions
$p_\xi(x) = \int p(x, y)\,dy$ and $p_\eta(y) = \int p(x, y)\,dx$ of the random variables ξ
and η. Now, according to formula (6), we can write

$$P(y < \eta < y + dy \mid \xi = a) = \lim_{dx \to 0}\frac{P(a < \xi < a + dx,\ y < \eta < y + dy)}{P(a < \xi < a + dx)} = \lim_{dx \to 0}\frac{p(a, y)\,dx\,dy}{p_\xi(a)\,dx} = \frac{p(a, y)}{p_\xi(a)}\,dy$$

which implies

$$p_\eta(y \mid \xi = a) = \frac{p(a, y)}{p_\xi(a)}$$

The conditional probability density $p_\xi(x \mid \eta = b)$ of ξ relative to the hypothesis
η = b is found in like manner:

$$p_\xi(x \mid \eta = b) = \frac{p(x, b)}{p_\eta(b)}$$

When the value of ξ entering into the hypothesis varies the law
of the conditional probability distribution of the variable η changes
in the general case. Hence, there is a certain relationship between
ξ and η but it differs from an ordinary functional relationship between
deterministic variables which was studied in the foregoing
chapters.

A relationship of this kind is called a correlation. Similarly, 
if we are given a joint distribution of an arbitrary number of 
random variables we can define the correlation between any of the 
variables and the rest. 

We often encounter relationships of a correlation type. For in- 
stance, when we speak about the relationship between the weight 
of a person and his height we undoubtedly mean a correlative rela- 
tion because we know that the weight is not completely and uniquely 
specified by the height. At the same time it is quite clear that the 
law of distribution of the weights of the people two metres high 
differs from that of the people one and a half metres high. When 
we say that smoking reduces the duration of life of a person we also 
mean a correlative dependence because, although there are many 
cases of different kind, we nevertheless find that the average duration 
of life of non-smokers is higher than that of smokers if we consider 
the law of probability distribution of the duration of life. We must 
carefully distinguish between the deterministic and correlative 
dependences and also take into account that in the latter case the 
existence of contradictory examples does not affect the general 
validity of probability inferences. 

The mean value of the conditional probability distribution of
the random variable η is a deterministic function of x (if x designates
the numerical value assumed by the variable ξ, which we denoted
as a above):

$$M\{\eta \mid \xi = x\} = \int y\,p_\eta(y \mid \xi = x)\,dy$$

Let us denote this function as f(x). The function f(x) (called the
regression function of η on ξ) expresses the conditional mean value
of η relative to the hypothesis ξ = x; the graph of the function is referred
to as the regression curve for the mean of η (the regression
line of η on ξ). In the above example of the duration of life of smokers,
it is the regression that defines the regularities we are interested
in. We can similarly determine the conditional mean value
φ(y) of ξ relative to the hypothesis that η = y and construct the
corresponding regression curve for the mean of ξ. It is interesting
that, generally speaking, the functions f(x) and φ(y) are not inverse
with respect to each other as it would be if we had a deterministic
relationship. This becomes especially clear in the case of independent
random variables ξ and η when both conditional means are constant
and the corresponding regression curves turn into straight
lines parallel to the x-axis (for the regression of η on ξ) and to the
y-axis (for the regression of ξ on η).

There is a comparatively simple special case when the regression
of both variables ξ and η is linear, i.e. when both regression functions
f(x) and φ(y) are linear. Let us introduce the notation

$$r_{\xi,\eta} = \frac{\overline{\xi\eta} - \bar{\xi}\,\bar{\eta}}{\sqrt{D\xi\,D\eta}}$$

The quantity $r_{\xi,\eta}$ is called the correlation coefficient of the random
variables ξ and η. It is possible to prove that we always have $|r_{\xi,\eta}| \le 1$
and that in the case when both functions f(x) and φ(y) are
linear they have the form

$$f(x) = r_{\xi,\eta}\sqrt{\frac{D\eta}{D\xi}}\,(x - \bar{\xi}) + \bar{\eta} \quad\text{and}\quad \varphi(y) = r_{\xi,\eta}\sqrt{\frac{D\xi}{D\eta}}\,(y - \bar{\eta}) + \bar{\xi}$$

but we shall not give the proof here. It follows that if $r_{\xi,\eta} > 0$ both
functions are increasing and if $r_{\xi,\eta} < 0$ they are decreasing.
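
The correlation coefficient and the linear regression function can be illustrated on a small discrete joint distribution. The table below is hypothetical and was chosen for this sketch because two-valued variables always have linear regression; the sums replace the integrals of the continuous case.

```python
from fractions import Fraction
from math import isclose, sqrt

# Hypothetical joint table {(x, y): probability} of two variables xi and eta.
joint = {(0, 0): Fraction(2, 5), (0, 1): Fraction(1, 10),
         (1, 0): Fraction(1, 10), (1, 1): Fraction(2, 5)}

def expect(g):
    """Mean of g(xi, eta) over the joint table."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mean_xi = expect(lambda x, y: x)
mean_eta = expect(lambda x, y: y)
var_xi = expect(lambda x, y: (x - mean_xi) ** 2)
var_eta = expect(lambda x, y: (y - mean_eta) ** 2)

# Correlation coefficient: (mean of product - product of means) / sqrt(D xi * D eta).
r = float(expect(lambda x, y: x * y) - mean_xi * mean_eta) / sqrt(var_xi * var_eta)

def conditional_mean_eta(x0):
    """Regression function of eta on xi: mean of eta given xi = x0."""
    p_x0 = sum(p for (x, _), p in joint.items() if x == x0)
    return sum(y * p for (x, y), p in joint.items() if x == x0) / p_x0

def regression_line(x):
    """The linear form r * sqrt(D eta / D xi) * (x - mean_xi) + mean_eta."""
    return r * sqrt(var_eta / var_xi) * float(x - mean_xi) + float(mean_eta)

print(r, [float(conditional_mean_eta(x)) for x in (0, 1)], [regression_line(x) for x in (0, 1)])
```

Here r is positive, and the conditional mean of η indeed increases with x and lies on the stated straight line.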

An important example of a linear correlation is the two-dimensional
normal law (see Sec. 11) with the density function

$$p(x, y) = M \exp\left[-(Ax^2 + 2Bxy + Cy^2)\right]$$

where M is the normalization factor and the quadratic form in the
parentheses is positive-definite. We suggest that the reader prove
that in this case we have

$$f(x) = -\frac{B}{C}\,x, \quad \varphi(y) = -\frac{B}{A}\,y \quad\text{and}\quad r_{\xi,\eta} = -\frac{B}{\sqrt{AC}}$$


17. Characteristic Functions. The characteristic function of a
random variable ξ is a function of a real parameter u of the form

$$\varphi_\xi(u) = M\{e^{iu\xi}\} \qquad (-\infty < u < \infty)$$

Property 6 in Sec. 14 enables us to write in full the expression of
a characteristic function in the form of a sum or of an integral:

$$\varphi_\xi(u) = \sum_k p_k\,e^{iux_k} \quad\text{or}\quad \varphi_\xi(u) = \int e^{iux}\,p_\xi(x)\,dx \qquad (23)$$


For the first time characteristic functions were systematically em- 
ployed by A. M. Lyapunov. 

The second formula (23) is nothing but the Fourier integral of
the function $\varphi_\xi(u)$ [see formula (XVII.141) in which we used another
notation]. Hence, the probability density $p_\xi(x)$ is expressed in
terms of the characteristic function by the formula

$$p_\xi(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \varphi_\xi(u)\,e^{-iux}\,du$$




We now enumerate some simple properties of a characteristic function.
Formula (23) shows that there must always be $|\varphi_\xi(u)| \le 1$
and $\varphi_\xi(0) = 1$. If $\eta = C_1\xi + C_2$ (where $C_1$ and $C_2$ are constants)
then

$$\varphi_\eta(u) = M\{e^{iu(C_1\xi + C_2)}\} = e^{iC_2u}\,M\{e^{i(C_1u)\xi}\} = e^{iC_2u}\,\varphi_\xi(C_1u)$$

If $\zeta = \xi + \eta$ and the variables $\xi$ and $\eta$ are independent then

$$\varphi_\zeta(u) = M\{e^{iu\zeta}\} = M\{e^{iu\xi}e^{iu\eta}\} = M\{e^{iu\xi}\}\,M\{e^{iu\eta}\} = \varphi_\xi(u)\,\varphi_\eta(u)$$


The first formula (23) and the last property enable us to deduce
the expression for the characteristic function of a random variable
having a binomial distribution (see Sec. 9):

$$\varphi_\xi(u) = (1 - P + Pe^{iu})^n$$

Now passing to the limit we obtain the characteristic function of
a random variable distributed according to the Poisson law:

$$\varphi_\xi(u) = \exp(-a + ae^{iu})$$
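Both expressions can be checked numerically. The sketch below is an illustration only and is not taken from the course: the function names are hypothetical, and the passage to the limit is assumed to be the usual one in which $n$ grows while the product $nP = a$ is kept fixed. The first function evaluates the sum in formula (23) directly, the second uses the closed binomial form, and the loop compares the binomial expression with the Poisson one.

```python
import cmath
import math

def binomial_cf_sum(u, n, p):
    """First formula (23): sum over k of p_k * exp(i*u*k) for the binomial law."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) * cmath.exp(1j * u * k)
               for k in range(n + 1))

def binomial_cf_closed(u, n, p):
    """Closed form (1 - P + P*exp(i*u))**n."""
    return (1 - p + p * cmath.exp(1j * u)) ** n

def poisson_cf(u, a):
    """Characteristic function exp(-a + a*exp(i*u)) of the Poisson law."""
    return cmath.exp(-a + a * cmath.exp(1j * u))

u = 0.7
gap = abs(binomial_cf_sum(u, 20, 0.3) - binomial_cf_closed(u, 20, 0.3))
print(gap)

# Assumed limiting scheme: n grows while n*P = a stays fixed.
a = 2.0
poisson_errors = [abs(binomial_cf_closed(u, n, a / n) - poisson_cf(u, a))
                  for n in (10, 100, 10000)]
print(poisson_errors)
```

The first printed number is at the level of rounding errors, and the discrepancies in the list decrease as $n$ grows.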


For the case of a uniform distribution (see Sec. 10) we obtain

$$\varphi_\xi(u) = \frac{e^{ibu} - e^{iau}}{i(b - a)u}$$


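For our further aims it is necessary to find the Fourier transform
of the function $f(x) = \exp(-x^2)$. Applying formula (XVII.138)
we obtain

$$F(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\exp(-x^2 - ikx)\,dx = \frac{1}{2\pi}\exp\left(-\frac{k^2}{4}\right)\int_{-\infty}^{\infty}\exp\left[-\left(x + \frac{ik}{2}\right)^2\right]dx$$

(Check this!) But the last integral in fact does not depend on $k$.
Indeed, denoting the integral by $I(k)$ and differentiating we obtain

$$\frac{dI}{dk} = \int_{-\infty}^{\infty}\frac{\partial}{\partial k}\exp\left[-\left(x + \frac{ik}{2}\right)^2\right]dx = -i\int_{-\infty}^{\infty}\left(x + \frac{ik}{2}\right)\exp\left[-\left(x + \frac{ik}{2}\right)^2\right]dx = \frac{i}{2}\exp\left[-\left(x + \frac{ik}{2}\right)^2\right]\Bigg|_{x=-\infty}^{x=\infty} = 0$$

(why is it so?). Consequently, $I(k) = I(0) = \int_{-\infty}^{\infty}\exp(-x^2)\,dx = \sqrt{\pi}$.
Thus, finally we get $F(k) = \frac{1}{2\sqrt{\pi}}\exp\left(-\frac{k^2}{4}\right)$. From this,
with the help of property 3 in Sec. XVII.33, we conclude that the
Fourier transform of the function $\exp(-ax^2)$ (where $a > 0$) is the
function $\frac{1}{2\sqrt{\pi a}}\exp\left(-\frac{k^2}{4a}\right)$.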


Now we can readily determine the characteristic function of a
random variable distributed according to the Gaussian law (see
Sec. 10). Let us first take the case $a = 0$. Since formula (23) expresses
the inverse Fourier transform of the function $p_\xi(x)$, it is sufficient
to multiply our result by $2\pi$, which yields

$$\varphi_\xi(u) = 2\pi\cdot\frac{1}{\sqrt{2\pi}\,\sigma}\cdot\frac{1}{2\sqrt{\pi/(2\sigma^2)}}\exp\left(-\frac{\sigma^2u^2}{2}\right) = \exp\left(-\frac{\sigma^2u^2}{2}\right)$$

To investigate the general case when $a \ne 0$ we can add $a$ to the
above random variable for which the characteristic function has
just been computed. Then denoting the new variable by the same
letter $\xi$ and taking advantage of the properties of characteristic
functions enumerated above we finally obtain

$$\varphi_\xi(u) = e^{iau}\exp\left(-\frac{\sigma^2u^2}{2}\right) = \exp\left(iau - \frac{\sigma^2u^2}{2}\right)$$
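The result for the Gaussian law can be verified by evaluating the integral in formula (23) numerically. The following Python sketch is a hedged illustration: the trapezoidal rule, the truncation of the integral to the interval $a \pm 12\sigma$, the number of steps and the function names are choices made only for this example.

```python
import cmath
import math

def gaussian_density(x, a, sigma):
    """Density of the normal law with mean a and standard deviation sigma."""
    return math.exp(-(x - a) ** 2 / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)

def cf_numeric(u, a, sigma, half_width=12.0, steps=20000):
    """Second formula (23) by the trapezoidal rule on [a - 12*sigma, a + 12*sigma]."""
    lo = a - half_width * sigma
    hi = a + half_width * sigma
    h = (hi - lo) / steps
    total = 0j
    for j in range(steps + 1):
        x = lo + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * cmath.exp(1j * u * x) * gaussian_density(x, a, sigma)
    return total * h

def cf_exact(u, a, sigma):
    """Closed form exp(i*a*u - sigma^2 * u^2 / 2)."""
    return cmath.exp(1j * a * u - sigma**2 * u**2 / 2)

a, sigma, u = 1.5, 0.8, 2.0
err = abs(cf_numeric(u, a, sigma) - cf_exact(u, a, sigma))
print(err)
```

The printed discrepancy is negligible, which agrees with the closed expression obtained above.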


In particular, this result implies a remarkable consequence. Let
$\xi_1$ and $\xi_2$ be two independent random variables distributed according
to the normal law with the parameters $a_1$, $\sigma_1$ and $a_2$, $\sigma_2$, respectively.
Then, for the variable $\xi = \xi_1 + \xi_2$, we obtain

$$\varphi_\xi(u) = \varphi_{\xi_1}(u)\,\varphi_{\xi_2}(u) = \exp\left(ia_1u - \frac{\sigma_1^2u^2}{2}\right)\exp\left(ia_2u - \frac{\sigma_2^2u^2}{2}\right) = \exp\left[i(a_1 + a_2)u - \frac{(\sigma_1^2 + \sigma_2^2)u^2}{2}\right]$$

Thus, we have again arrived at the normal law with the parameters
$a = a_1 + a_2$ and $\sigma = \sqrt{\sigma_1^2 + \sigma_2^2}$. The invariance of the normal
law with respect to the addition of random variables is one of the
basic properties of the law which accounts for its being so
widespread. Among the probability distributions of discrete random
variables, the Poisson law possesses this property.


§ 4. Applications of the Normal Law 


18. The Normal Law as the Limiting One. We now investigate
the behaviour of the binomial law (see Sec. 9) when $P$ remains constant
and $n \to \infty$. A random variable $\xi^{(n)}$ distributed according
to the binomial law has the mean $a_n = nP$ and the standard deviation
$\sigma_n = \sqrt{n}\,\sqrt{P(1 - P)}$ (see Secs. 14 and 15) and hence we have
$a_n \to \infty$ and $\sigma_n \to \infty$ for $n \to \infty$. Thus, we see that $\xi^{(n)}$ "spreads"
over the whole $x$-axis in the limit. This makes it difficult to investigate
the behaviour of $\xi^{(n)}$ directly. It is therefore convenient
to perform a linear transformation of the variable $\xi^{(n)}$ so that the
mean value should become equal to zero and the standard deviation
become equal to unity after the transformation has been carried
out. A transformation of this kind is called the standardization (or
normalization) of the variable $\xi^{(n)}$, and it is expressed by the following
simple formula:

$$\eta^{(n)} = \frac{1}{\sigma_n}\left(\xi^{(n)} - a_n\right)$$

The variable $\eta^{(n)}$ is known as the standardized (normalized) variable
corresponding to the random variable $\xi^{(n)}$.

There is a remarkable theorem referred to as the De Moivre- 
Laplace theorem which states that the law of distribution of the 
above standardized random variable tends to the normal law when
$n \to \infty$. (P. Laplace, 1749-1827, a famous French astronomer,
physicist and mathematician.)

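The theorem is proved as follows. By Sec. 17, we have

$$\varphi_{\eta^{(n)}}(u) = e^{-i\frac{a_n}{\sigma_n}u}\left(1 - P + Pe^{i\frac{u}{\sigma_n}}\right)^n = \exp\left(-i\frac{nP}{\sqrt{nP(1-P)}}\,u\right)\left[1 - P + P\exp\left(\frac{iu}{\sqrt{nP(1-P)}}\right)\right]^n =$$

$$= \left\{\exp\left(-i\sqrt{\frac{P}{n(1-P)}}\,u\right)\left[1 - P + P\exp\left(\frac{iu}{\sqrt{nP(1-P)}}\right)\right]\right\}^n$$

Expanding the expression in the curly brackets in powers of $1/\sqrt{n}$
we obtain

$$\varphi_{\eta^{(n)}}(u) = \left\{\left(1 - i\sqrt{\frac{P}{n(1-P)}}\,u - \frac{Pu^2}{2n(1-P)} + \dots\right)\left(1 + i\sqrt{\frac{P}{n(1-P)}}\,u - \frac{u^2}{2n(1-P)} + \dots\right)\right\}^n =$$

$$= \left\{1 - \frac{u^2}{2n} + \dots\right\}^n \to \exp\left(-\frac{u^2}{2}\right) \quad (n \to \infty)$$

(check the calculations!). Thus we have arrived at the characteristic
function of the normal law (see Sec. 17) with the parameters
$a = 0$ and $\sigma = 1$.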
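The convergence can also be observed numerically. The following Python sketch (an illustration under assumed values $u = 1$ and $P = 0.3$, with hypothetical function names) evaluates the characteristic function of the standardized binomial variable for several $n$ and compares it with $\exp(-u^2/2)$.

```python
import cmath
import math

def standardized_binomial_cf(u, n, p):
    """exp(-i*a_n*u/sigma_n) * (1 - p + p*exp(i*u/sigma_n))**n."""
    a_n = n * p
    sigma_n = math.sqrt(n * p * (1 - p))
    return cmath.exp(-1j * a_n * u / sigma_n) * (1 - p + p * cmath.exp(1j * u / sigma_n)) ** n

def normal_cf(u):
    """Characteristic function of the normal law with a = 0, sigma = 1."""
    return cmath.exp(-u * u / 2)

u, p = 1.0, 0.3
errors = [abs(standardized_binomial_cf(u, n, p) - normal_cf(u))
          for n in (10, 1000, 100000)]
print(errors)
```

The discrepancy decreases roughly like $1/\sqrt{n}$, which corresponds to the first omitted terms of the expansion.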

In Sec. 14 we mentioned that a variable distributed according 
to the binomial law is the sum of n independent random summands 
with the same simplest law of probability distribution. But it turns 
out that the normal law is obtained in the limit for any initial law 
of distribution (of course, except a deterministic law). 

Indeed, for the sake of simplicity, let us suppose that we have
an initial law of distribution with the characteristic function $\varphi_0(u)$
and with the zero mean value, the variance being equal to unity.
These limitations are inessential because in the general case we carry
out the normalization. Then formula (23) implies $\varphi_0'(0) = 0$ and
$\varphi_0''(0) = -1$ etc., and hence, on the basis of Taylor's formula, we
have $\varphi_0(u) = 1 - \frac{u^2}{2} + \dots$. Using the notation similar to the
above we find

$$\varphi_{\eta^{(n)}}(u) = \left[\varphi_0\left(\frac{u}{\sqrt{n}}\right)\right]^n = \left[1 - \frac{u^2}{2n} + \dots\right]^n \to \exp\left(-\frac{u^2}{2}\right) \quad (n \to \infty)$$
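As an illustration of this general statement we can take an initial law different from the binomial one. The Python sketch below assumes, purely as an example, the uniform law on the interval $[-\sqrt{3}, \sqrt{3}]$, which has zero mean and unit variance; its characteristic function $\sin(\sqrt{3}u)/(\sqrt{3}u)$ follows from the formula for the uniform distribution in Sec. 17. The function names are hypothetical.

```python
import math

def uniform_cf(u):
    """Characteristic function of the uniform law on [-sqrt(3), sqrt(3)]."""
    if u == 0:
        return 1.0
    s = math.sqrt(3) * u
    return math.sin(s) / s

def standardized_sum_cf(u, n):
    """Characteristic function of (xi_1 + ... + xi_n) / sqrt(n)."""
    return uniform_cf(u / math.sqrt(n)) ** n

u = 1.5
errors = [abs(standardized_sum_cf(u, n) - math.exp(-u * u / 2)) for n in (1, 10, 1000)]
print(errors)
```

Already for ten summands the characteristic function is close to $\exp(-u^2/2)$.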


It turns out that the condition that the laws of distribution of 
the summands should be the same is also inessential. For instance, 
A. M. Lyapunov proved that the law of distribution for the standardized
sum of independent random summands $\xi_1, \xi_2, \dots, \xi_n$ is
also close to the Gaussian law when $n$ is large if the ratio

$$\sum_{k=1}^{n} M|\xi_k - a_k|^3 \Big/ \left(\sum_{k=1}^{n} D\xi_k\right)^{3/2} \qquad (a_k = M\xi_k)$$


is small. This condition is violated if the variance of a small number 
of summands considerably exceeds the variance of the rest. In this 
case the latter summands do not contribute to the whole result, in 
the limit, after the standardization has been performed. Lyapunov’s 
condition is also violated in some other special cases, for instance, 
in the case leading to the Poisson law (check up this assertion!). 
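The behaviour of Lyapunov's ratio is easy to examine for summands taking only the values 0 and 1. The sketch below is an illustration with hypothetical function names. It compares identical summands with a fixed probability $P = 0.5$ and the scheme assumed here to lead to the Poisson law, in which $P = a/n$ decreases as $n$ grows.

```python
def lyapunov_ratio(third_abs_moments, variances):
    """Sum of M|xi_k - a_k|^3 divided by (sum of D xi_k) to the power 3/2."""
    return sum(third_abs_moments) / sum(variances) ** 1.5

def bernoulli_moments(p):
    """Variance and third absolute central moment of a 0/1 variable with P(1) = p."""
    variance = p * (1 - p)
    third = p * (1 - p) ** 3 + (1 - p) * p ** 3
    return variance, third

# Identical summands with P = 0.5: the ratio equals 1/sqrt(n) and becomes small.
v, m3 = bernoulli_moments(0.5)
ratio_same = [lyapunov_ratio([m3] * n, [v] * n) for n in (10, 1000)]

# Assumed Poisson-type scheme: P = a/n, so that n*P = a stays fixed.
a = 2.0
ratio_poisson = []
for n in (10, 1000, 100000):
    v, m3 = bernoulli_moments(a / n)
    ratio_poisson.append(lyapunov_ratio([m3] * n, [v] * n))

print(ratio_same)
print(ratio_poisson)
```

In the second case the ratio approaches $1/\sqrt{a}$ instead of tending to zero, so the condition is not fulfilled.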

If the standardization results in a normal distribution we obviously
have a normally distributed variable before the standardization,
but with an arbitrary mean value and variance. Hence, we can say
that the sum of many independent random summands is normally
distributed irrespective of the laws of distribution of the summands.
The exceptions to the rule are the cases enumerated in the fore- 
going paragraph. Here lies the main cause making the Gaussian 
law so important. 

In particular, it is usually assumed that the random errors of a 
measurement obey the normal law. Actually, as a rule, such an error 
results from mutual superposition of a great many small indepen- 
dent errors which cannot be taken into account separately. It is 
this fact that leads to the assumption that the Gaussian law is appli- 
cable here. 

19. Confidence Interval. We now come back to the problem of 
tossing a coin which was considered in Sec. 1. It is clear that if the 
coin comes up heads 200 times in 1000 tosses we have every reason 
to suspect that something is wrong. Shall we say the same if we 
have 400 heads or 450 heads? In other words, shall we consider it 
to be unusual if the relative frequency of the coin coming up heads 
is 0.4 or 0.45? Now we are able to answer a question of this type. 

Let us take a more general situation. Consider a random variable
$\xi$ (in the above example the role of $\xi$ was played by the number of
occurrences of heads in one toss). Suppose that $n$ trials have been
performed, and let $\xi$ assume the values $x_1, x_2, \dots, x_n$ in these
trials. Let us designate the arithmetic mean value of the quantities
$x_1, x_2, \dots, x_n$ as $\bar{x}^{(n)}$:

$$\bar{x}^{(n)} = \frac{1}{n}(x_1 + x_2 + \dots + x_n) \qquad (24)$$

On the basis of Sec. 13, we can assert that $\bar{x}^{(n)} \to \bar{\xi}$ for $n \to \infty$. This
is the so-called law of large numbers (in our course we have taken
the law of large numbers as the foundation of the definition of the
mean $\bar{\xi}$ of a random quantity $\xi$). But what is the rate at which $\bar{x}^{(n)}$
approaches $\bar{\xi}$ as $n \to \infty$?
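Let us consider the random variable

$$\xi^{(n)} = \frac{1}{n}(\xi_1 + \xi_2 + \dots + \xi_n)$$

where all the summands are independent, each $\xi_i$ ($i = 1, 2, \dots, n$)
being distributed according to the same law of probability distribution
as $\xi$. Then quantity (24) is one of the possible values of the
variable $\xi^{(n)}$. But, on the basis of Sec. 18, we can regard the random
variable $\xi^{(n)}$ as being distributed according to the Gaussian law
for large $n$. Besides, we have

$$\overline{\xi^{(n)}} = \frac{1}{n}(\bar{\xi}_1 + \bar{\xi}_2 + \dots + \bar{\xi}_n) = \bar{\xi}$$

and

$$D\xi^{(n)} = \frac{1}{n^2}\sum_{i=1}^{n} D\xi_i = \frac{1}{n^2}\,nD\xi = \frac{\sigma_\xi^2}{n}$$

where $\sigma_\xi$ is the standard deviation of the variable $\xi$. Therefore,
by formula (14), we can put

$$P\{a < \xi^{(n)} < b\} = \frac{1}{2}\left[\Phi\left(\sqrt{n}\,\frac{b - \bar{\xi}}{\sigma_\xi}\right) - \Phi\left(\sqrt{n}\,\frac{a - \bar{\xi}}{\sigma_\xi}\right)\right]$$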

For the sake of simplicity, let us restrict ourselves to the case of
symmetric intervals of the form $|\xi^{(n)} - \bar{\xi}| < \delta$. Then we obtain

$$P\{|\xi^{(n)} - \bar{\xi}| < \delta\} = \Phi\left(\frac{\delta\sqrt{n}}{\sigma_\xi}\right) \qquad (25)$$

It is formula (25) that provides the answer to the problem stated 
at the beginning of this section. In practical applications it can be 
used for any values of $n$. Here we give a rough table of the values of
the function $\Phi(t)$:

t         | 1.2  | 1.3  | 1.4  | 1.5  | 1.6  | 1.7  | 1.8  | 1.9  | 2.0  | 2.1  | 2.2  | 2.3  | 2.4
$\Phi(t)$ | 0.77 | 0.81 | 0.84 | 0.87 | 0.89 | 0.91 | 0.93 | 0.94 | 0.95 | 0.96 | 0.97 | 0.98 | 0.98


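For large values of $t$ the corresponding values of $\Phi(t)$ can be found
with great accuracy by means of the asymptotic
series (see Sec. XVII.19):

$$\Phi(t) = \frac{2}{\sqrt{2\pi}}\int_0^t e^{-s^2/2}\,ds = 1 - \frac{2}{\sqrt{2\pi}}\,e^{-t^2/2}\,\Psi(t)$$

where

$$\Psi(t) = \frac{1}{t} - \frac{1}{t^3} + \frac{1\cdot 3}{t^5} - \frac{1\cdot 3\cdot 5}{t^7} + \dots$$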
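The quality of this approximation can be seen from a short computation. The Python sketch below is an illustration only. It assumes that the function $\Phi(t)$ of formula (14) is normalized so that $\Phi(t) = \operatorname{erf}(t/\sqrt{2})$, an assumption consistent with the tabulated value 0.77 at $t = 1.2$ and with the values 0.99 and 0.997 used in the text for $t = 2.6$ and $t = 3$. The function names and the choice of four terms are hypothetical.

```python
import math

def phi_exact(t):
    """Assumed normalization: Phi(t) = erf(t / sqrt(2))."""
    return math.erf(t / math.sqrt(2))

def phi_asymptotic(t, terms=4):
    """1 - (2/sqrt(2*pi)) * exp(-t^2/2) * (1/t - 1/t^3 + 1*3/t^5 - 1*3*5/t^7 + ...)."""
    psi = 0.0
    coeff = 1.0  # successive numerators: 1, 1, 1*3, 1*3*5, ...
    for k in range(terms):
        psi += (-1) ** k * coeff / t ** (2 * k + 1)
        coeff *= 2 * k + 1
    return 1.0 - 2.0 / math.sqrt(2 * math.pi) * math.exp(-t * t / 2) * psi

for t in (3.0, 4.0, 5.0):
    print(t, phi_exact(t), phi_asymptotic(t))
```

The agreement improves rapidly as $t$ increases, whereas for small $t$ the series is useless, as is typical of asymptotic expansions.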


Now let us return to the problem of tossing the coin. Suppose we
agree that an event whose probability is less than 0.01 will be considered
to be highly improbable. Then choosing $\delta$ so that the right-hand
side of (25) should become equal to 0.99 we obtain the corresponding
interval which includes all those values which we regard
as probable. In our case we obtain $\delta\sqrt{n}/\sigma_\xi = 2.6$. We have $n = 1000$
and $\sigma_\xi = 0.5$, and hence $\delta = 0.041$. Let us denote the number of
occurrences of heads by $M$. Then we get a confidence interval
$459 < M < 541$ for $M$. Thus we have obtained confidence limits for
the number $M$ which we can guarantee disregarding those events
which are considered to be practically impossible relative to the
criterion we have chosen (see Sec. 7).
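The arithmetic of this example can be reproduced as follows. The Python sketch is an illustration under the assumption that $\Phi(t) = \operatorname{erf}(t/\sqrt{2})$; the bisection routine `solve_phi` is a hypothetical helper introduced only for this example. It finds the argument for which $\Phi(t) = 0.99$ and then uses the rounded value 2.6 adopted in the text.

```python
import math

def phi(t):
    """Assumed normalization: Phi(t) = erf(t / sqrt(2))."""
    return math.erf(t / math.sqrt(2))

def solve_phi(target, lo=0.0, hi=10.0, iterations=100):
    """Bisection for t such that phi(t) = target (phi is increasing)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if phi(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, sigma = 1000, 0.5                   # number of tosses, standard deviation for one toss
t_exact = solve_phi(0.99)              # close to the rounded value 2.6 used in the text
delta = 2.6 * sigma / math.sqrt(n)     # half-width for the relative frequency
low, high = n * (0.5 - delta), n * (0.5 + delta)
print(round(t_exact, 3), round(delta, 3), round(low), round(high))
```

With the same routine the questions posed at the beginning of the section are answered at once: 450 heads and, all the more, 400 heads lie outside the interval.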

20. Data Processing. In Sec. 19 we considered a random variable
$\xi$ whose law of probability distribution was regarded as being known.
But in practical problems we most often encounter cases when the
law of probability distribution is not known beforehand and when
the numerical characteristics of the random variable $\xi$ under investigation
should be determined on the basis of the outcomes of the
trials. For instance, this is the case when it is necessary to find the
mean value of a certain parameter characterizing some property
(such as strength or longevity and the like) of a manufactured article
belonging to a lot (general population) on the basis of inspecting
a sample of a number of articles randomly drawn from the lot.
The same problem arises when we repeatedly measure some unknown
physical quantity because the result of every measurement
is a random quantity due to the unavoidable random errors. When 
there are no systematic errors the arithmetic mean of the experi- 
mental data is approximately taken as the precise value of the 
quantity in question and so on. 

Suppose that we have performed $n$ trials and that a random variable
$\xi$ has taken the corresponding numerical values $x_1, x_2, \dots, x_n$.
Then our previous investigation indicates that it is the arithmetic
mean value $\bar{x}^{(n)}$ of the results we have obtained [defined by formula (24)]
that should be taken as an approximate value of $\bar{\xi}$. Besides,
it turns out that an approximate value of the standard deviation
$\sigma_\xi$ can be chosen as

$$\sigma_\xi \approx \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}^{(n)}\right)^2} \qquad (26)$$


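To prove (26) we use the notation introduced in Sec. 19 and additionally
put $\zeta_n = \frac{1}{n-1}\sum_{i=1}^{n}\left(\xi_i - \xi^{(n)}\right)^2$. Computing the mean value
of $\zeta_n$ we find

$$M\{\zeta_n\} = \frac{1}{n-1}\,M\left\{\sum_{i=1}^{n}\xi_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n}\xi_i\right)^2\right\}$$

(check the result!). Now, representing each $\xi_i$ in the form
$\xi_i = (\xi_i - \bar{\xi}) + \bar{\xi}$ and calculating we obtain

$$M\{\zeta_n\} = \frac{1}{n-1}\left[nD\xi + n\bar{\xi}^2 - \frac{1}{n}\left(nD\xi + n^2\bar{\xi}^2\right)\right] = D\xi = \sigma_\xi^2$$

Besides, computing $D\zeta_n$, we find that $D\zeta_n \to 0$ when $n \to \infty$. (Let
the reader perform the calculations.) But the radicand in (26) is
nothing but a value of the random variable $\zeta_n$. This implies (26).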
Now we can set a criterion according to which random events will
be considered to be practically impossible (see Sec. 7) and then,
using formula (25), construct a confidence interval for $\bar{\xi}$. For
instance, let us disregard the events whose probability is less than
0.003. Then (25) indicates that we can put $\delta\sqrt{n}/\sigma_\xi = 3$. Thus, we get
the following confidence interval for $\bar{\xi}$:

$$\bar{x}^{(n)} - \frac{3\sigma_\xi}{\sqrt{n}} < \bar{\xi} < \bar{x}^{(n)} + \frac{3\sigma_\xi}{\sqrt{n}} \qquad (27)$$

The confidence limits thus obtained are guaranteed with a probability
of 0.997. Formula (27) is widely applied to practical problems.
The value of $\sigma_\xi$ entering into (27) can be taken from formula (26).
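Formulas (24), (26) and (27) are easily combined into one short procedure. The Python sketch below is an illustration: the function name `three_sigma_interval` and the nine measurement values are invented for the example. For so small a sample the Gaussian approximation on which (27) rests is rather rough, and therefore the result should be regarded only as a demonstration of the arithmetic.

```python
import math

def three_sigma_interval(samples):
    """Return (mean, sigma_estimate, low, high) following formulas (24), (26), (27)."""
    n = len(samples)
    mean = sum(samples) / n                                              # formula (24)
    sigma = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))   # formula (26)
    half_width = 3 * sigma / math.sqrt(n)                                # formula (27)
    return mean, sigma, mean - half_width, mean + half_width

# Hypothetical repeated measurements of one physical quantity.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.0]
mean, sigma, low, high = three_sigma_interval(data)
print(mean, sigma, low, high)
```

Note that the divisor in the estimate of the variance is $n - 1$ and not $n$, in accordance with (26).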


Let us discuss a consequence of formula (27). Suppose that we
have performed two independent series of $n$ trials for one and the
same random variable $\xi$ and that this has resulted in the mean
values $\bar{x}_1^{(n)}$ and $\bar{x}_2^{(n)}$. Then estimation (27) holds for both values with
the probability $(0.997)^2 = 0.994$, and hence $|\bar{x}_1^{(n)} - \bar{x}_2^{(n)}| < \frac{6\sigma_\xi}{\sqrt{n}}$
with this probability. This means that the empirical means $\bar{x}^{(n)}$
assume some specified values with a great probability in any series
of trials when the number of trials $n$ is sufficiently large. This result
is quite clear in its qualitative aspect because it is obviously implied
by the definition of the mean.

Let us be given two correlated random variables $\xi$ and $\eta$ and let
the regression of both variables be linear (see Sec. 16). If $n$ trials
are performed we obtain $n$ pairs of the values assumed by the variables:
$(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$. Then we can pose the problem
of approximating the regression line of $\eta$ on $\xi$ on the basis of the
data. It turns out that the solution of this problem directly leads
to the method of least squares described in Sec. XII.8.

The theory of probability was originated in the 17th century in 
connection with investigating regularities of games of chance. It 
was thoroughly developed in the 19th and 20th centuries and is now 
an important branch of mathematics which has many applications 
in various divisions of science. Many prominent mathematicians 
took part in creating the theory of probability. Among Russian 
scientists who contributed much to the theory we can mention
P. L. Chebyshev, A. M. Lyapunov, A. A. Markov, S. N. Bernstein
(1880-1968), A. N. Kolmogorov, Yu. V. Linnik and others. There
is a mathematical science called mathematical statistics which is 
directly related to the theory of probability. Mathematical statis- 
tics deals with problems of processing statistical data. There are 
many courses on the theory of probability, mathematical statistics 
and their applications. For a beginner we recommend (41, 147], 
[41], [45] and [48]. 


CHAPTER XIX 




Computers 


The simplest computing devices such as a slide rule, an abacus, 
an arithmometer and tables are well known and widely applied to 
practical problems. But these devices do not meet the requirements 
of modern science, engineering and economics. There are many important
problems which are solvable in principle but whose solution
cannot be practically obtained by means of the above devices
because this would take too much time. Therefore in applying these
simplest tools we usually disregard many essential factors in order 
to simplify the calculations, and this often leads to quantitative 
and even to essential qualitative errors. 

The need for more effective computing devices and the achieve- 
ments of modern technology have led to a revolution in data hand- 
ling and problem solving practice. This has resulted in inventing 
and constructing high-speed electronic computers. The intensive 
application of these machines has made it possible to solve many 
important problems and to obtain fruitful results by means of 
introducing mathematical methods in a number of new fields of 
human activities. There is no doubt that the development of modern 
calculating devices and the extension of their application will result 
in radical reorganization of scientific research, engineering work, 
economics, control, service and so on. 


§ 1. Two Classes of Computers 


There are two basic methods of representing mathematical quan- 
tities entering into calculations. The first method is based on a direct 
representation of mathematical quantities by some physical quan- 
tities such as lengths, angles, voltages and the like. To perform 
certain operations on these quantities it is necessary to construct 
a physical system in which the corresponding physical quantities 
are transformed according to the law which describes the transformation
of the mathematical quantities in question. Computers based
on this principle are called analogue or simulating computers. The
slide rule and the planimeter (see Sec. XIV.12) are the simplest
examples of an analogue calculating device. The second method
is based on the use of a certain device which makes it possible to
represent the mathematical quantities under consideration in a digital
form. The transformation of these quantities is then reduced
to arithmetic operations on digits. The computing machines of this 
class are called digital computers. In particular, the abacus and the 
arithmometer belong to this type. The above-mentioned achieve- 
ments in applying computers are basically connected with digital 
computers. 

1. Analogue Computers. We begin with a simple example. Let
it be necessary to find the sum $s$ of two given quantities $x$ and $y$:

$$s = x + y \qquad (1)$$

The problem can be simulated by means of a mechanical scheme
(for instance, see Fig. 346a) representing $x$, $y$ and $s$ as lengths or by
means of an electric circuit with two rheostats (shown in Fig. 346b)
where $x$, $y$ and $s$ are represented as the corresponding electric currents.
Many other devices can also be used for this purpose.
This simple example clearly illustrates the characteristic features
of analogue calculating devices.

Fig. 346. (a) Mechanical scheme; (b) electric circuit with two rheostats

First of all, the input variables (input parameters) $x$ and $y$ can be
continuously varied within certain limits. Of course, we can
imagine that the resistances shown in Fig. 346b are varied in
a discrete way by means of a set of resistors. But such a discreteness
would be constructional whereas the digital computer operations
are essentially discrete.

Further, it is clear that the accuracy of the values of input parameters
and that of the result which can be guaranteed in these devices
is not high. Usually it is of the order of several per cent or, at
best, of several tenths of a per cent. This apparently limits the possibility
of simulating complicated calculations. Moreover, analogue
computers are usually special purpose computers, that is, suited for
solving problems of a certain specific class. For instance, the devices
represented in Fig. 346 are intended only for performing operation
(1) and some other operations directly related to it (for example,
subtraction). But it should be noted that when we have to solve
many similar special problems and when the required accuracy is
not high the use of analogue devices and computers proves to be
very effective. For instance, electronic integrators are widely used
for solving systems of ordinary differential equations.

The above example also clearly indicates that one and the same
functional relationship between the quantities in question can be
realized by means of different physical schemes. This feature is also
common to a great number of more important and complicated problems.
It creates the foundation of simulating physical processes.
Suppose that it is necessary to determine the numerical value of
a quantity $S$ entering into a physical system and that it is difficult
to measure or calculate the quantity directly. Then we can try to
design another system of different physical nature in which the
quantities involved are connected by the same functional relationship.
If such a system is constructed we simply measure the quantity
corresponding to $S$ in the new system. Besides, when there is
such a mathematical equivalence of two physical systems we often
[...] some other purposes. In recent years methods based on simulating
conditions of a problem by means of electromechanical, optical,
electro-diffusive and other processes have become widespread.

Fig. 347. (a) z = x + y; (b) z = xy; (c) z = f(x); (d) amplifier; (e) differentiator

An analogue computer is often constructed as an aggregate consisting
of several units (blocks, components) each of which is capable
of performing only one operation. We now consider the most commonly
used methods of representing operations on quantities by
means of voltages. Fig. 347a shows an adder with two input terminals
and one output terminal. If some constant (or dependent
on time $t$) voltages $x$ and $y$ are applied to the inputs the voltage at
the output terminal will be equal to $z = x + y$. When considering
such a diagram we are not interested in the specific physical processes
involved. We similarly represent a multiplier (Fig. 347b), a
unit performing a functional transformation for a certain function
$f$ which is shown in Fig. 347c (for instance, such a unit can perform
the transformation $f(x) = \sin x$ if the sine is taken as $f$ and the like),
an amplifier (Fig. 347d) in which the amplification factor $k$ can be
varied etc. In solving differential equations we use a differentiator
(see Fig. 347e) which produces the output voltage $\frac{dy}{dx}$ for two time-dependent
voltages $x(t)$ and $y(t)$ applied to the inputs (in the general
case the output $\frac{dy}{dx}$ also depends on $t$). There are many other units
which are used in constructing an analogue computer. Arranging
these blocks in different combinations we obtain aggregates capable
of solving various equations.

For example, let us consider a scheme intended for solving a sys- 
tem of equations of the form 


ge rr! 
dz + f(y) = 0 


It is more convenient to rewrite the system as 


The corresponding diagram is represented in Fig. 348. The circle
represents a voltage source with one of its terminals earthed, and
the voltages transmitted by the channels are also put down there.
The scale of the amplification factor can be calibrated in inverse
quantities and then there is no need to compute the values of the
corresponding reciprocals.

Fig. 348

We see that this scheme is convenient for studying the way in
which the variations of the coefficients affect the solution. Modifying
this scheme we can solve many other similar problems. But it
should be noted that in case the problem in question has several
solutions this scheme may yield a solution different from the one
we are interested in.

Let us take one more example illustrating the integration of a
differential equation. For instance, let us take an equation of the form

$$y'' + f(y') + y = \psi(x) \qquad (y = y(x))$$

The corresponding scheme is shown in Fig. 349. Let the reader verify
the correctness of the scheme [in doing this it is more convenient
to rewrite the equation in the form $y = \psi(x) - y'' - f(y')$]. Here
the quantity $y$ is a variable whose variations should be related to the
variations of the independent variable $x$.

Fig. 349 (input $x$; output)
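The behaviour of such a scheme can be imitated by a digital calculation. The Python sketch below is only an illustration and does not reproduce the scheme of Fig. 349: instead of differentiators it uses numerical integration, that is, the equation is solved for the highest derivative, $y'' = \psi(x) - f(y') - y$, and integrated by the classical Runge-Kutta method. The function name `simulate`, the step count and the test case are assumptions made for the example.

```python
import math

def simulate(f, psi, y0, dy0, x_end, steps):
    """Integrate y'' = psi(x) - f(y') - y from x = 0 to x_end (classical Runge-Kutta)."""
    h = x_end / steps
    x, y, v = 0.0, y0, dy0   # v stands for y'

    def accel(x_, y_, v_):
        return psi(x_) - f(v_) - y_

    for _ in range(steps):
        k1y, k1v = v, accel(x, y, v)
        k2y, k2v = v + h / 2 * k1v, accel(x + h / 2, y + h / 2 * k1y, v + h / 2 * k1v)
        k3y, k3v = v + h / 2 * k2v, accel(x + h / 2, y + h / 2 * k2y, v + h / 2 * k2v)
        k4y, k4v = v + h * k3v, accel(x + h, y + h * k3y, v + h * k3v)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        x += h
    return y, v

# Check case with a known answer: f = 0, psi = 0, y(0) = 1, y'(0) = 0 gives y = cos x.
y_end, v_end = simulate(lambda v: 0.0, lambda x: 0.0, 1.0, 0.0, math.pi, 1000)
print(y_end, v_end)
```

Replacing the two lambda functions by other expressions for $f$ and $\psi$ we obtain the solution of other equations of the same form, just as the units of an analogue scheme are readjusted.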

We often choose the time $t$ as an independent variable. For this
case the differentiator (see Fig. 347e) has only one input to which
the voltage $\psi(t)$ is applied. The above scheme for solving the differential
equation should be then changed correspondingly. Namely,
instead of putting in the voltage $x$ we must simply apply the voltage
$\psi(t)$ to the terminal which is represented by the point in Fig. 349.
This can be performed without calculating $\psi(t)$ if we read this
signal from a graph or from some data unit. Such a scheme is particularly
convenient when the physical meaning of the problem indicates
that the independent variable is the time $t$. Then the problem
can be solved in real time. The solution can be directly transmitted
(without human intervention) to some other system for further
utilization. Many devices intended for automatic control are based
on this idea.

The real time simulation makes it possible to replace some expensive
aggregates by computers when testing a complicated device.
For instance, instead of testing an autopilot in a flight which is
dangerous and expensive we can do it in a test stand where an analogue
computer substitutes for a real airplane. It is apparent that
this computer must precisely simulate the reaction of the airplane
to the actions of the autopilot.

The fundamentals of the theory of analogue computers can be 
found in [42] and [80]. 

2. Digital Computers. The first devices of the type of an abacus 
were invented in China as early as 2,000 or 3,000 years B.C. The 
first mechanical computer capable of performing addition and subtraction
was constructed in 1642 by the prominent French mathematician,
physicist and philosopher B. Pascal (1623-1662). The modern
desk calculators (summing machines capable of performing addition
and subtraction, and semiautomatic and automatic arithmometers
which perform all four fundamental operations of arithmetic)
are essentially refinements of Pascal's calculator. All these useful
devices cannot work very fast because the speed of performing
calculations is low and because the input data are entered into such
a machine by an operator.

At the beginning of the 20th century the necessity for processing 
a large number of similar data in statistics, book-keeping and finan- 
ces led to the creation of punch card computers (tabulators, sorters 
etc.). The input data to be inserted in such machines are punched 
on cards (punch cards; see Sec. 4). The punch card reader reads the 
cards by means of a set of brushes (which sense the holes punched in 
the cards) and produces the corresponding electric pulses. These 
electric signals make the machine work according to the given pro- 
gram. It can sort the cards, sum up the parameters, accumulate 
the results and perform some other simple operations. These ma- 
chines are very useful but they cannot be applied to more complicated 
calculations. 

As it has been mentioned, the revolution in this field is connected 
with the appearance of a new class of calculating machines which 
are referred to as universal high-speed digital automatic computers. 
Although the idea of constructing such machines appeared as early 
as the 19th century, it is only the achievements of modern electronics
that have made it possible to realize the idea. The first high-speed
electronic digital computer ENIAC (Electronic Numerical
Integrator and Computer) was constructed in the USA in 1943-1945. The
basic theoretical ideas and principles of constructing such machines 
were formulated by the American mathematician J. von Neumann 
(1903-1957) in 1946. In the USSR the first machines of this type 
were manufactured in 1952. 

The block diagram of an automatic digital computer is shown in
Fig. 350 which represents the main units and their interconnection.
In the memory (store or storage) there are a number of locations
(cells or compartments) each of which can store one number or one
instruction (order or command). As we shall see in Sec. 5, the form
in which an instruction is represented within the machine does not
differ from that of a number. Some of these locations store the information
obtained from the input device before the calculations
are started whereas the others can be filled in the process of work.
When the machine performs calculations the contents of the locations
may change many times, and some of the locations may remain
empty and may not be used. The control unit interprets the instructions
given to the machine and sends the numbers (or instructions)
into the arithmetic unit. The arithmetic unit transforms the numbers
according to the instructions and then returns them to the memory.
As the required results are obtained, they are automatically printed
according to the signals sent by the control unit, and after the
computations have been completed the control unit stops the computer.
(In § 2 we shall describe the sequence of the operations in greater
detail.)

Fig. 350. Block diagram of an automatic digital computer (the dotted lines represent control channels)

Any mathematical problem can be solved by means of a universal
automatic digital computer provided that a certain algorithm is
given. An algorithm is understood as an accurate assignment defining
a step-by-step procedure for solving the problem which necessarily leads
to the required result on the basis of the input data. But there
are also many non-mathematical problems for which a similar algo- 
rithm can be indicated, and such problems are solvable by means 
of a digital computer. For example, an electronic computer can
control the process of machining a workpiece of a complicated pro- 
file in a metal-cutting lathe. In this case the input data characterize 
the profile, and the results of the calculations are transformed into 
signals controlling the lathe. The device controlling the flight of 
an airplane beginning with the take-off and up to the landing at 
a prescribed place operates in a similar way. A digital computer 
can control a manufacturing process and makes it possible to com- 
pletely automatize the process which is particularly important in 
those industries which are hazardous to human health. If the per- 
formance of the process deviates from the initial program the com- 
puter can find the most advantageous solution by comparing diffe- 
rent variants, and it can also check the result. Of course, the neces- 
sary information concerning the real state of the process should be 
entered in such a machine automatically by means of special devices. 
A computer can help in designing an engineering construction be- 
cause it can examine hundreds of variants and choose the best of them 
by applying a certain given criterion. Digital computers are also
successfully applied to the problems of weather forecast, to the
transportation problem etc. But it should be noted that in many 
such applications it often turns out that special purpose machines 
intended for solving some specific problems are more effective than 
the universal (general purpose) computers. 

The introduction of computers facilitates the automatization of 
many forms of human mental activities. A computer can realize 
an algorithm worked out by a human being and even develop new 
algorithms in the process of the realization. 


§ 2. Programming 


3. Number Systems. The decimal number system which is stu- 
died at school and which we use in everyday life is inconvenient 
for the work of electronic digital computers. For these purposes the 
binary number system proves to be much more convenient. The 
binary system uses only two digits, 0 and 1, whereas there are ten
digits in the decimal system (i.e. 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9).
The binary system is based on the convention that combinations of 
the form 10, 100 etc. designate the corresponding powers of two but 
not of ten, i.e. 10 designates the number two, 100 designates the 
number four and so on. 

The table below illustrates the decimal-to-binary conversion of 
natural numbers: 


Decimal form | 1 | 2  | 3  | 4   | 5   | 6   | 7   | 8    | 9    | 10   | 11   | 12   | 13   | etc.
Binary form  | 1 | 10 | 11 | 100 | 101 | 110 | 111 | 1000 | 1001 | 1010 | 1011 | 1100 | 1101 | etc.


Any integer can be written in the binary form. To represent an
integer as a binary number we must isolate from it powers of two,
in succession, beginning with the highest power. For instance:


1972 = 1·1024 + 1·512 + 1·256 + 1·128 + 0·64 + 1·32 +
     + 1·16 + 0·8 + 1·4 + 0·2 + 0·1


and therefore the binary form of the decimal number 1972 is
11110110100.
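
The procedure of isolating powers of two, beginning with the highest one, can be sketched in a few lines of Python. This is only an illustration; the function name `to_binary` is our own and does not come from the book.

```python
def to_binary(n: int) -> str:
    """Binary form of a non-negative integer, found by isolating powers of two."""
    if n == 0:
        return "0"
    # Find the highest power of two that does not exceed n.
    power = 1
    while power * 2 <= n:
        power *= 2
    digits = []
    # Go down through the powers; write 1 if the power is contained in n.
    while power >= 1:
        if n >= power:
            digits.append("1")
            n -= power
        else:
            digits.append("0")
        power //= 2
    return "".join(digits)


print(to_binary(1972))  # prints 11110110100
```

The digits are produced from the highest place to the lowest, exactly in the order in which the powers 1024, 512, ... are written in the expansion of 1972.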


We similarly convert decimal fractions to binary ones. For instance,
the binary number 10.1011 means 2 + 1/2 + 1/8 + 1/16 = 43/16 = 2.6875
in the decimal system. Any fractional number can be written in
the form of a finite or infinite binary fraction; of course, infinite
binary fractions, like decimal ones, are rounded in practical
computations.

The addition and multiplication tables in the binary number
system are extremely simple:

0 + 0 = 0,   1 + 0 = 0 + 1 = 1,   1 + 1 = 10,
0·0 = 0,     0·1 = 1·0 = 0,       1·1 = 1.


Applying these tables we can perform arithmetic operations on num- 
bers written in the binary form in the same way as we perform them 
on decimal numbers. 
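
As an illustration, column addition of two binary numerals can be sketched as follows: the digits are added from right to left and a carry is passed to the next place, just as in decimal addition. The function name `add_binary` is hypothetical.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary numerals digit by digit, right to left, with carries."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter numeral with zeros
    carry = 0
    digits = []
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        digits.append(str(total % 2))  # digit written in this place
        carry = total // 2             # digit carried to the next place
    if carry:
        digits.append("1")
    return "".join(reversed(digits))


print(add_binary("1101", "1011"))  # 13 + 11 = 24, prints 11000
```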

The main disadvantage of the binary system is that the binary 
representation of a number requires much space even when the 
number is comparatively small. 

Therefore other systems for representing numbers are also used. 
The octal (octanary) number system which uses eight digits
(0, 1, 2, 3, 4, 5, 6 and 7) is of particular importance. The number
“eight” is put down as 10 in this system, the number “nine” has
the form 11, the number “sixty-four” is represented as 100 etc. We
can easily convert numbers from the octal to the decimal system and vice
versa. For instance, the octal number 571 has the decimal form

5·8² + 7·8 + 1 = 377

The application of the octal system is accounted for by the fact
that the length of the representation of a number in this system is
not much greater than that in the decimal system but at the same
time it is very simple to convert an octal number to binary and vice
versa. For instance, the numbers 5, 7 and 1 have the binary form
101, 111 and 001, respectively, and therefore the octal number 571
is put down as 101 111 001 in the binary system because the passage
from one octanary place to the next one corresponds to the
multiplication by 1000 in the binary system. Thus, to convert from binary
to octal it is necessary to group the binary digits in groups of three
(from the binary point) and write down the decimal value of each
group taken as an integer. In particular, the locations of the memory
of a computer are usually numbered in the octal system. For instance,
if there are 512 (this is the decimal five hundred twelve) locations
then they receive the octal numbers ranging from 0 to 777 (check
it up!).
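
The digit-by-digit passage between the octal and the binary systems can be sketched as follows. Both function names are hypothetical, and only integers are handled here.

```python
def octal_to_binary(octal: str) -> str:
    """Replace each octal digit by its three-digit binary form."""
    return " ".join(format(int(d, 8), "03b") for d in octal)


def binary_to_octal(bits: str) -> str:
    """Group binary digits in threes from the right; read each group as a digit."""
    bits = bits.replace(" ", "")
    bits = bits.zfill(-(-len(bits) // 3) * 3)  # pad on the left to a multiple of 3
    return "".join(str(int(bits[i:i + 3], 2)) for i in range(0, len(bits), 3))


print(octal_to_binary("571"))            # prints 101 111 001
print(binary_to_octal("101111001"))      # prints 571
print(binary_to_octal(format(511, "b"))) # last of 512 locations, prints 777
```

No arithmetic on the number as a whole is needed, which is why the octal notation is a convenient shorthand for binary words.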

In entering numbers into a computer we also use the so-called
binary-decimal representation. For this purpose each decimal digit
is coded in the binary system (four-digit binary code). The first
ten decimal numbers (beginning with zero) are represented in this
code as in the binary system:

Decimal code        | 0    | 1    | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9
Binary-decimal code | 0000 | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 | 1000 | 1001

To represent the subsequent natural numbers we write the corresponding
quadruple of the binary digits 0 and 1 in place of each decimal
digit. For example, the decimal numbers 63 and 125 are written as


0110 0011 and 0001 0010 0101


respectively, in this code. Decimal fractions are similarly converted 
to binary-decimal fractions. 

The binary-decimal representation of a number occupies still 
more space than the binary one but it is very convenient because 
it is easy to pass to the decimal representation from it and vice versa. 
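
A minimal sketch of the binary-decimal coding of natural numbers is given below. The names `to_bcd` and `from_bcd` are our own; fractions and signs are not treated.

```python
def to_bcd(number: str) -> str:
    """Write a four-digit binary group in place of each decimal digit."""
    return " ".join(format(int(d), "04b") for d in number)


def from_bcd(code: str) -> str:
    """Read the groups of four binary digits back as decimal digits."""
    bits = code.replace(" ", "")
    return "".join(str(int(bits[i:i + 4], 2)) for i in range(0, len(bits), 4))


print(to_bcd("63"))    # prints 0110 0011
print(to_bcd("125"))   # prints 0001 0010 0101
print(from_bcd("0001 0010 0101"))  # prints 125
```

For comparison, the ordinary binary form of 125 is 1111101, seven digits instead of twelve.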

4. Representing Numbers in a Computer. In an electronic digital
computer numbers are represented in the binary system and stored 
in the memory unit, storage (see Sec. 2). Each location of the storage 
in which a number can be stored contains one and the same number 
of binary places each of which can carry either 0 or 1. There are two 
main methods of writing numbers in different constructions of com- 
puters. 

1. Fixed point method. In this method the first digit specifies the 
sign of a number; conventionally, + is represented as 0 and — as 1. 
The subsequent digits are the binary digits (standing to the right 
of the binary point) of the binary representation of the number. For 
example, if there are 30 digits the representation 


1 | 0010111001 0000000000000000000        (2)


corresponds to the binary number −0.0010111001 which is expressed as

−(1/8 + 1/32 + 1/64 + 1/128 + 1/1024) = −185/1024 = −0.1806640625

in the decimal system.

In such a location (of 30 places) it is possible to write the numbers
ranging from −(1 − 2⁻²⁹) to +(1 − 2⁻²⁹) with the interval 2⁻²⁹
between the numbers (why?). Therefore, when putting the necessary
quantities into the machine we must supply the numbers which fall
out of the range with the corresponding scale factors so that after
the multiplication by the factors the numbers should lie in the
interval from −1 to +1.

It should be noted that there are some other methods of representing
negative numbers but we shall not treat them here.
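
The fixed point method can be sketched as follows for a location of 30 digits: one sign digit followed by 29 digits standing to the right of the binary point. The function names are hypothetical, and digits beyond the 29th are simply dropped here, which is an assumption of this sketch (a real machine may round instead).

```python
from fractions import Fraction

WORD = 30  # number of binary digits in a location, as in the text


def encode_fixed(x: Fraction) -> str:
    """Sign digit (0 for +, 1 for -) followed by 29 fractional binary digits."""
    if abs(x) >= 1:
        raise ValueError("scale the number into the interval (-1, 1) first")
    sign = "1" if x < 0 else "0"
    magnitude = int(abs(x) * 2 ** (WORD - 1))  # digits beyond the 29th are dropped
    return sign + format(magnitude, f"0{WORD - 1}b")


def decode_fixed(word: str) -> Fraction:
    """Inverse of encode_fixed for a word of sign digit plus fractional digits."""
    value = Fraction(int(word[1:], 2), 2 ** (len(word) - 1))
    return -value if word[0] == "1" else value


word = encode_fixed(Fraction(-185, 1024))
print(word)                       # sign digit 1, then 0010111001 and 19 zeros
print(float(decode_fixed(word)))  # prints -0.1806640625
```

The largest word, a zero sign digit followed by 29 ones, decodes to 1 − 2⁻²⁹, and neighbouring words differ by 2⁻²⁹.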

2. Floating point method. This method considerably extends 
the range of the numbers which can be put into the memory of a 
machine. It is based on the additional multiplication of a number 
by a power of two. A certain fixed number of digits is then used for 
representing the exponent and its sign. Suppose these places are
chosen at the end of each location and there are six such digits.
Then the maximal number which can be stored in a location (having
30 digits) is represented in the following form:


0 | 11111111111111111111111 | 0 | 11111        (3)

(from left to right: the sign of the number, the binary digits of the
number to the right of the binary point, the sign of the exponent,
the exponent)


Hence, this number is equal to (1 − 2⁻²³)·2³¹ = 2³¹ − 2⁸ ≈ 2³¹.
Accordingly, the minimal positive number which can be represented
by this method is stored in the form


0 | 00000000000000000000001 | 1 | 11111        (4)


and thus it is equal to 2⁻²³·2⁻³¹ = 2⁻⁵⁴ (remember that the first
method gave us the least positive number 2⁻²⁹).
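
The range of the floating point method can be checked with a short sketch. The layout assumed here is the one read from the text: one sign digit, 23 digits to the right of the binary point, one digit for the sign of the exponent and five digits for the exponent itself. The function name `decode_float` is hypothetical.

```python
from fractions import Fraction

MANTISSA_DIGITS = 23  # digits of the number to the right of the binary point
EXPONENT_DIGITS = 5   # digits of the exponent (its sign takes one more digit)


def decode_float(word: str) -> Fraction:
    """Value of a 30-digit word: sign, 23 mantissa digits, exponent sign, exponent."""
    sign = -1 if word[0] == "1" else 1
    mantissa = Fraction(int(word[1:1 + MANTISSA_DIGITS], 2), 2 ** MANTISSA_DIGITS)
    exp_sign = -1 if word[1 + MANTISSA_DIGITS] == "1" else 1
    exponent = exp_sign * int(word[2 + MANTISSA_DIGITS:], 2)
    return sign * mantissa * Fraction(2) ** exponent


largest = decode_float("0" + "1" * 23 + "0" + "11111")
smallest = decode_float("0" + "0" * 22 + "1" + "1" + "11111")
print(largest == 2 ** 31 - 2 ** 8)         # prints True
print(smallest == Fraction(1, 2 ** 54))    # prints True
```

Exact fractions are used instead of Python floats so that the comparison with 2³¹ − 2⁸ and 2⁻⁵⁴ involves no rounding.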

It sometimes occurs that the number of digits in a location is 
insufficient for guaranteeing the accuracy needed. In such cases it 
is possible to use a special program which takes two consecutive 
locations (the first and the second, the third and the fourth etc.) 
for each number (this is called the double precision method). 

The input operation, that is “writing” numbers, is realized in 
different ways depending on the type of computer. The numbers to 
be introduced into the input unit of a computer (see Sec. 2) are pun- 
ched on a special tape (punch tape) or on a deck of special cards 
(punch cards). In the first case to each location there corresponds 
a certain part of the tape and in the second case a certain row on a 
card. The length of a row corresponds to the number of digits con- 
tained in a location; a hole punched in the card corresponds to the 
digit 1 and an unpunched place corresponds to the digit 0. For 
instance, if the numbers (2), (3) and (4) (we mean the decimal form 
of the numbers here) follow in succession in the input program the 
corresponding punched card contains a part of the form shown in 
Fig. 351 (check it up!). The holes are punched by means of a special 
perforator (which is not directly connected with the machine) before 
the machine is started. 

A programmer who composes the program (routine) for solving 
a problem often writes the instructions (see Sec. 5) in the octal 
system and the input numbers in the decimal system. Then after
the program is punched on a keyboard of the perforator (which re-
sembles a cash register used in shops) the instructions are punched 
on cards in the binary form, according to the program, and the input 
data are punched in the binary-decimal code. The conversion of 
numbers to the binary system is carried out according to the instruc- 
tions written by the programmer in the program, the operation being 
performed within the machine at the beginning of its work by means 
of a special subprogram (subroutine) furnished into the machine. 

Fig. 351

Punch cards are in some respects more convenient than a punch
tape because a program can be very extensive and can occupy
a deck of cards or a reel of tape, and it is much easier to replace
several cards in the deck in case some mistakes have been found 
than to insert the corrections into the tape. 

The representation of numbers stored in the memory of a computer 
(and transformed in the process of calculations) is realized in the 
machine by means of two-state elements. Each location is a set of
such elements, the number of the elements being equal to the number 
of binary digits in the location. A feature of these elements is that 
each of them can be in one of the two states. One of these states is 
regarded as a physical embodiment of 0 and the other as a physical 
embodiment of 1, and thus each binary place in a location can carry 
either 0 or 1. The construction of the elements varies with the model 
of a computer but in all cases they must be inertialess (i.e. they 
must pass from one state to the other in a trice) and stable (i.e. they 
must indefinitely long remain in each of the states unless there is 
a transition signal). An element can be realized as a trigger consis- 
ting of two triodes whose cathodes are connected together and whose 
anodes and grids are connected crosswise. In such a circuit either 
one triode or the other is always cut off, and the transition from 
one state into the other is performed under the action of a voltage 
pulse. Charged and uncharged parts of a dielectric plate can also 
serve as the elements, the transition from one state to the other 
being performed by means of an electron-beam tube similar to that 
used in a television set. In some computers magnetic drums are 
used as storage devices. Such a drum rapidly revolves under a mag- 
netizing head similar to the one used in a tape recorder. The elements 
in such a device are magnetized and unmagnetized portions on the 
surface of a drum. Ferrite cores (magnetic cores) which can change
the direction of their magnetization when a pulse of current passes
through the winding and some other devices are also used as such 
elements. 

A number can be transferred from a location of the storage of a 
computer to the arithmetic unit and in the opposite direction along 
a single channel with each digit occurring serially in time sequence, 
or more than one digit is simultaneously transferred in parallel by 
means of a system of channels whose number can equal the number 
of digits in the location. Accordingly, the computers of the first 
type are referred to as being serial in storage access and those of the 
second type are said to have a parallel storage system. The latter 
method increases the speed of performing operations but at the same 
time it complicates the construction of a computer. 

The greater the number of locations in the storage, the greater 
the amount of information that can be entered into the machine. 
Therefore, besides a high-speed internal memory, a computer is 
usually equipped with an external memory in order to increase 
storage capacity. An external memory is usually realized as a magne- 
tic tape from which numbers stored there can be transferred, by 
means of a special device, to the internal memory. A magnetic tape 
can carry millions of locations but the speed of recording informa- 
tion on the tape and reading it from the tape is less than that of the 
internal memory because passing the tape under the magnetizing 
head requires additional time. 

The results of calculations are converted to the decimal system 
by means of a special subroutine and then printed on paper tape 
or stored (in the external memory) to be used for further
calculations.

Besides handling numerical data a machine can also process some 
other types of information (for instance, expressed in words) provi- 
ded it has been coded beforehand by means of binary numbers. It 
is possible to compile a program which makes a special device decode 
the results after the operation has been completed, i.e. convert the 
output data to the form which is natural for this particular type 
of information. For instance, this is the case in machine translation
from one language into another.

5. Instructions. A machine performs operations on the numbers stored
in the memory only according to instructions which are entered
into the memory before the computation is started. Instructions are
stored in the binary form not differing from that of numbers. There
are one-address, two-address and three-address instructions which are
used depending on the type of the machine. Three-address
instructions are especially convenient for programming, and we shall
consider them here.

A location storing a three-address instruction can be regarded
as being divided into four parts of specified length:
let these parts contain 3, 9, 9 and 9 binary digits, respectively (as
above, we consider locations with 30 digits). The contents of these 
parts are the following: 

(1) The first part termed the operation part contains the code of 
the operation and tells the machine what to do. 

(2) The second part carries the address (i.e. the number of the loca- 
tion) of the first number taking part in the operation. 

(3) The third part contains the address of the second number in- 
volved in the operation. 

(4) The fourth part stores the address of the location to which the 
result of the operation must be transferred. 

All the locations are numbered in the natural order. If the memory 
contains 512 locations nine binary digits are sufficient to write the 
addresses (why?). Suppose that the number 1 is the code of the ope- 
ration of addition. Then if it is necessary to add together the numbers 
stored in the 417th and 73rd locations and to transfer the result 
to the 646th location the corresponding instruction must have the
form

        001   100001111   000111011   110100110        (5)
parts:  1st      2nd         3rd         4th


The reader should verify this form taking into account the above 
division of locations carrying instructions and the fact that the 
numbers of locations are coded in the octal system (that is the above
numbers 417, 73 and 646 are octal). After the instruction has been
executed, the contents of all the locations of the memory (including 
the 417th and the 73rd locations) remain the same as before except 
the 646th location in which its former contents are replaced by the 
sum of the numbers contained in the locations with the addresses 
417 and 73. 
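
The verification suggested above can be sketched as follows: the operation code is written with 3 binary digits and each octal address with 9 binary digits. The function name `encode_instruction` is hypothetical.

```python
def encode_instruction(opcode: int, a1: str, a2: str, a3: str) -> str:
    """Pack a 3-digit operation code and three 9-digit addresses given in octal."""
    parts = [format(opcode, "03b")]
    parts += [format(int(a, 8), "09b") for a in (a1, a2, a3)]
    return " ".join(parts)


# Addition (code 1) of the numbers in locations 417 and 73, result to location 646.
print(encode_instruction(1, "417", "73", "646"))
# prints 001 100001111 000111011 110100110
```

Nine binary digits give 2⁹ = 512 different addresses, which is why they suffice for a memory of 512 locations.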

Instruction (5) is punched on a punch tape or on a punch card 
and then read in the input unit and transferred to one of the loca- 
tions in the memory where it is stored in the same way as numbers. 
The distinction between numbers and instructions only appears 
in the process of operation of the machine because if the program 
is compiled correctly the control unit must receive as instructions
only those signals which are sent from the locations in which the 
instructions have been placed. 

A one-address instruction is placed in a location divided into 
two parts which contain the code of an operation and the address 
of the location. To perform the above operation we need three one-
address instructions: the first instruction sends the number from
the 417th location into the adder, the second makes the adder sum
up the former number and the number stored in the 73rd location
and the execution of the last instruction results in transferring the
outcome of the addition to the 646th location. A two-address
instruction consists of three parts which carry the code of an operation
and the addresses of the numbers on which the operation must be 
performed. After the operation has been performed the result is 
sent to the second address. 

In what follows we shall consider only three-address instructions. 
For the sake of simplicity we shall write the signs of operations 
instead of their codes (for instance, the sign + will designate the 
code of the operation of addition) and decimal numbers of locations 
instead of their octal numbers. 

In most computers instructions are executed serially, in a conse- 
cutive order, unless the program given to the machine contains 
transfer instructions which will be discussed in Sec. 6. Before the 
machine is started, the program (which is a certain sequence of 
instructions) and initial data (that is an array of numbers on which 
the corresponding operations must be performed) are read from pun- 
ch cards or punch tape and fed into the memory. Then the con- 
trol unit takes the contents of the first memory location as an in- 
struction and sends the corresponding signals to the arithmetic unit 
which performs the operations. After that the control unit takes, 
as the next instruction, the contents of the second location and so 
on with the exception of transfer instructions mentioned above. 
When a transfer instruction has been executed the control is trans- 
ferred not to the subsequent location but to the location whose address 
is contained in the instruction. Some instructions cause the machine 
to print the contents of some locations but after the realization of 
such an instruction the machine also passes to the next location. 
(By the way, printing takes more time than performing arithmetic 
operations, and therefore the speed of computations decreases when 
the program makes the machine print very often.) This step-by- 
step procedure goes on until the control unit receives the stop in- 
struction after which the machine is brought to a stop. The machine 
is also automatically stopped if the control unit takes from the pro- 
gram an instruction that cannot be executed (for instance, if a 
number of which the square root must be extracted turns out to be 
negative). The same will be in case an overflow occurs, that is if 
the calculations result in a number which is so large that it exceeds 
the capacity of the location and cannot be written there (see Sec. 4). 
The control panel is equipped with a special device which makes it 
possible to check the contents of any location and to insert additio- 
nal data at any moment of the machine operation process. 

Each type of automatic digital computer has its own set of in- 
structions and system of coding. In Sec. 6 we shall give several 
examples illustrating the main principles of programming. The 
reader should take into account that these examples are given only 
as an illustration and that in real computers these principles are
realized more economically. For more detail the reader is referred
to [8], [18], [20], [24], [25] and [27]. 

The composition of an extensive program is complicated work 
which often takes much time and requires much experience. 

6. Examples of Programming. Let it be necessary to compute the
solution of the system of equations of the first degree


ax + by = m
cx + dy = n


where the numerical values of the quantities a, b, c, d, m and n are
considered to be given. According to formulas (VI.2), the sought-for
solution is expressed as


x = (md − bn)/(ad − bc),   y = (an − mc)/(ad − bc)        (6)

We now place the necessary instructions in locations with addresses
(numbers) 1, 2, .... The total number of the required instructions
is yet unknown. Let the six input parameters a, b, c, d, m and n
be stored in locations having the numbers α + 1, α + 2, α + 3,
α + 4, α + 5 and α + 6, respectively. The value of α will be
specified later on. Several subsequent locations will be used for storing
intermediate results of the calculations. The numbers and instructions
stored in these locations before the computations are started
do not matter because when a number is written in a location its
former contents are automatically erased. The other locations will
not be used in compiling the program and calculating the result.
Let us begin with computing the first numerator in (6). To compute
md we write the instruction

(1)  ×  α+5  α+4  α+7

(here we shall write the serial number in front of each of the instructions
but it should be noted that these numbers are not in fact punched
on cards or on tape). After the instruction has been executed
the number md is placed in location (α + 7). The next instruction
is of the form

(2)  ×  α+2  α+6  α+8

and its execution results in the appearance of the number bn in
location (α + 8). Next, the number bn [stored in location (α + 8)]
must be subtracted from the number md [location (α + 7)]. Since
we no longer need these numbers the result can be written in location
(α + 7) by means of the following instruction:

(3)  −  α+7  α+8  α+7
In this example we could have used location (α + 9) for storing
the result md − bn but in more complicated programs it is often
necessary to economize locations.

Now we similarly put down the instructions which lead to the
computation of the denominator of the fractions entering into (6):

(4)  ×  α+1  α+4  α+8
(5)  ×  α+2  α+3  α+9
(6)  −  α+8  α+9  α+8

Further, the numerator stored in location (α + 7) must be divided
by the denominator stored in location (α + 8), and the result should
be printed:

(7)  :  α+7  α+8  α+7
(8)  Print  α+7

The execution of the last instruction results in printing the contents
of location (α + 7), i.e. the desired value of x, and then the machine
proceeds to execute the subsequent instruction. Taking into account
that the denominator ad − bc has already been computed and is
stored in location (α + 8) we similarly write the instructions which
result in computing y and completing the solution of the problem:

(9)   ×  α+1  α+6  α+7
(10)  ×  α+3  α+5  α+9
(11)  −  α+7  α+9  α+7
(12)  :  α+7  α+8  α+7
(13)  Print  α+7
(14)  Stop


The 14th instruction stops the machine. Thus, our program contains
14 instructions and hence we can put α = 14. Then the whole
program will occupy 20 locations and will have the form

(1)   ×  19  18  21
(2)   ×  16  20  22
(3)   −  21  22  21
(4)   ×  15  18  22
(5)   ×  16  17  23
(6)   −  22  23  22
(7)   :  21  22  21
(8)   Print  21
(9)   ×  15  20  21
(10)  ×  17  19  23
(11)  −  21  23  21
(12)  :  21  22  21
(13)  Print  21
(14)  Stop
(15)  a
(16)  b
(17)  c
(18)  d
(19)  m
(20)  n

Suppose that the program is to be punched on punch cards each
of which contains 12 rows (which carry numbers or instructions).
Then it is advisable to take a greater value of α, for instance,
α = 24. The matter is that if we put α = 24 all the instructions will
be placed on the first two cards and all the input data on the third
one because α + 1 = 25. This will enable us to replace only the
third card if the initial data change and to use the first two cards
repeatedly.

Thus, to store the intermediate results of the program we need
only three locations; if α = 14 these are the 21st, 22nd and 23rd
locations. Now we must punch the program on cards and start the
machine. (In reality there are some additional instructions which
should also be written in the program but they do not matter for
our illustrative purposes. For instance, these are the instructions
according to which the program is read from the cards by the input
unit and introduced into the memory, the instructions of converting
the data from the binary-decimal code to the binary system
etc.)
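
To see how such a program is executed serially, one can imitate the machine in a few lines of Python. This is only a sketch under simplifying assumptions: the memory is a dictionary, an instruction is a tuple rather than a binary word, the operation names `"*"`, `"-"`, `"/"`, `"print"` and `"stop"` stand for the operation codes, and the functions `run` and `linear_system_program` are our own.

```python
import operator

# Arithmetic operations of the imaginary three-address machine.
OPS = {"*": operator.mul, "-": operator.sub, "+": operator.add, "/": operator.truediv}


def run(memory: dict) -> list:
    """Execute instructions serially from location 1 until Stop; return printed values."""
    printed = []
    pc = 1  # address of the location taken as the current instruction
    while True:
        op, *addr = memory[pc]
        if op == "stop":
            return printed
        if op == "print":
            printed.append(memory[addr[0]])
        else:
            a1, a2, a3 = addr
            memory[a3] = OPS[op](memory[a1], memory[a2])  # old contents are erased
        pc += 1


def linear_system_program(a, b, c, d, m, n) -> dict:
    """Fourteen instructions in locations 1..14 and the data in 15..20 (alpha = 14)."""
    return {
        1: ("*", 19, 18, 21), 2: ("*", 16, 20, 22), 3: ("-", 21, 22, 21),
        4: ("*", 15, 18, 22), 5: ("*", 16, 17, 23), 6: ("-", 22, 23, 22),
        7: ("/", 21, 22, 21), 8: ("print", 21),
        9: ("*", 15, 20, 21), 10: ("*", 17, 19, 23), 11: ("-", 21, 23, 21),
        12: ("/", 21, 22, 21), 13: ("print", 21), 14: ("stop",),
        15: a, 16: b, 17: c, 18: d, 19: m, 20: n,
    }


# The system 2x + y = 5, x + 3y = 10 has the solution x = 1, y = 3.
print(run(linear_system_program(2, 1, 1, 3, 5, 10)))  # prints [1.0, 3.0]
```

Locations 21, 22 and 23 appear in the dictionary only when the first results are written into them, which mirrors the remark that their former contents do not matter.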

Now we can easily illustrate a program of calculating the values
of a function (see Sec. 1.13). For instance, let the function
y = x² − 3x + 7 be considered. Verify that the program

(1)  ×  α  α  8
(2)  ×  6  α  β
(3)  −  8  β  8
(4)  +  8  7  β
(5)  Stop
(6)  3
(7)  7

causes the machine to compute the value of y which corresponds to
any value of x stored in location α and to place it in location β.
Such a program is easily reproducible and ready for application 
after being punched. There are certain rules of mathematical opera- 
tions on such programs including integrating functions, finding 
extrema and the like. Development of computer techniques will 
undoubtedly lead to extensive replacement of formulas by the 
corresponding programs in many divisions of mathematics and its 
applications.
In the above examples the number of instructions in the program 
equals the number of the necessary operations. But modern digital 
computers are essentially intended for performing calculations in- 
volving thousands and millions of operations. It is clear that in 
such a case we cannot write down all the instructions corresponding 
to the operations. Fortunately, in programs involving a great number 
of calculations most intermediate operations are carried out many 
times according to one and the same scheme. This makes it possible
to form loops in programs which cause the machine to repeat one
and the same part of a program several times. Loops are formed by
means of the conditional transfer instruction which is of the form


Transfer of control  N₁  N₂  N₃


(It is apparent that instead of the words “transfer of control” we
must in fact write the code of the operation.) There are many
variants of the realization of the instruction. For definiteness, we assume
the following interpretation of the above instruction. Let the numbers
N₁, N₂ and N₃ designate the addresses of the corresponding locations
and let (N₁)′, (N₂)′ and (N₃)′ be the contents of the locations.
Suppose that the instruction causes the machine to compare the number
(N₁)′ with the number (N₂)′ and proceed to execute the subsequent
instruction if (N₁)′ < (N₂)′ or the instruction stored in location
N₃ if (N₁)′ ≥ (N₂)′, the contents of the locations remaining
unchanged in either case. In particular, the instruction


Transfer of control  1  1  N₃


causes the machine to execute the instruction stored in location N₃
after the above instruction has been read by the control unit. This
is the so-called unconditional transfer instruction.

As an example, let us compose a program for printing the table
of reciprocals of 1000 successive natural numbers taken, for instance,
from 2001 to 3000. Such a program formed without the transfer
instruction would be very extensive, but it becomes in fact
rather short if the instruction is used. Let us place the number 2000
in location (α + 1) and the number 1 in location (α + 2). Besides,
let us place the number 2999 in location (α + 3) (soon we shall see
why the number is introduced). Let the first instruction have the form

(1)  +  α+1  α+2  α+1

(we again place the instructions in locations 1, 2, ...). After it
has been executed the number 2001 substitutes for the number 2000
in location (α + 1). The next two instructions result in calculating
the inverse of 2001 and in printing it:

(2)  :  α+2  α+1  α+4
(3)  Print  α+4

Now we write the instruction

(4)  Transfer of control  α+3  α+1  1

This instruction compares the number stored in location (α + 1)
with the number 2999 and transfers the control back to location 1
since the number 2001 placed in location (α + 1) is less than 2999:
(α + 1)′ = 2001 < (α + 3)′ = 2999. Thus the machine again executes
the first instruction which results in the appearance of the
number 2002 instead of 2001 in location (α + 1). Then the instructions
(2) and (3) are executed and thus the inverse of 2002 is computed
and printed. Next, taking the fourth instruction from the storage
and executing it the control unit causes the machine to compare
2999 with 2002 and to pass to the execution of the first instruction
again, etc. Only after the recurrent addition of unity to the contents
of location (α + 1) results in the appearance of 3000 and the inverse
of 3000 is computed and printed, the fourth instruction will transfer
the control to the next instruction since then we shall have
(α + 3)′ = 2999 < (α + 1)′ = 3000. Then the machine must be
stopped because all the desired results will have been printed, and
therefore the fifth instruction is

(5)  Stop

Thus, we can put α = 5, and hence the whole program will have
the following form:


(1)  +  6  7  6
(2)  :  7  6  9
(3)  Print  9
(4)  Transfer of control  8  6  1
(5)  Stop
(6)  2000
(7)  1
(8)  2999


We have used only one location 9 for storing the intermediate results 
but at the same time the contents of location 6 have been changed 
1000 times from 2000 to 3000 (with step 1) in the process of the cal- 
culations. 
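
The work of the loop can be imitated by the following sketch. As before, instructions are modelled by tuples and the operation names are our own; the transfer is taken when the first number is not less than the second, which is the reading under which "Transfer of control 1 1 N₃" is unconditional. The step limit in `run` is an extra safeguard that a real machine does not have.

```python
def run(memory: dict, limit: int = 100_000) -> list:
    """Execute a program with a conditional transfer; return the printed values."""
    printed = []
    pc = 1
    for _ in range(limit):  # guard against a program that never stops
        op, *addr = memory[pc]
        pc += 1  # by default the next location is taken
        if op == "stop":
            return printed
        if op == "print":
            printed.append(memory[addr[0]])
        elif op == "transfer":
            n1, n2, n3 = addr
            if memory[n1] >= memory[n2]:
                pc = n3  # control passes to location N3
        elif op == "+":
            memory[addr[2]] = memory[addr[0]] + memory[addr[1]]
        elif op == "/":
            memory[addr[2]] = memory[addr[0]] / memory[addr[1]]
    raise RuntimeError("step limit reached")


program = {
    1: ("+", 6, 7, 6),
    2: ("/", 7, 6, 9),
    3: ("print", 9),
    4: ("transfer", 8, 6, 1),
    5: ("stop",),
    6: 2000, 7: 1, 8: 2999,
}
table = run(program)
print(len(table), table[0] == 1 / 2001, table[-1] == 1 / 3000)  # prints 1000 True True
```

Five instructions thus produce 1000 printed values, and after the run location 6 contains 3000.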

We now consider a variant of the above program which results 
not in printing the reciprocals but in placing them into the locations 
of the memory with the numbers ranging from 10 to 1009. These 
stored numbers can be used for some further calculations; of course, 
we suppose that the memory capacity makes it possible to place the 
numbers. This program has the following form: 


(1)  +  6  7  6
(2)  +  3  9  3
(3)  :  7  6  9
(4)  Transfer of control  8  6  1
(5)  Stop
(6)  2000
(7)  1
(8)  2999
(9)  (In this location 1 is the last binary digit, and all the other
     digits are equal to 0)

Here we have placed an auxiliary number in location 9 which has
no quantitative meaning and serves only for modifying instructions.
This is quite a new operation which is performed in the following
way in our program: after the second instruction has been executed
the first time, the third instruction is sent to the arithmetic unit
where it is transformed into


:  7  6  10

[By the way, it should be noted that the sign + in instructions (1)
and (2) designates the operations of addition which are performed
according to different rules and therefore they have different codes.
For more detail the reader is referred to the books enumerated above.]
The third instruction taking the above form, its execution results
in placing the number 1/2001 in location 10. After the repeated
execution of the second instruction, the third instruction is modified
again and takes the form

:  7  6  11

Therefore the second execution of instruction (3) causes the control
unit to send the number 1/2002 to location 11 and so on. Thus we have
encountered here the operation of modifying an address entering
into an instruction.

Hence, it is possible to perform operations on instructions thus 
automatically modifying them in the process of work of the machine. 
This obviously extends the application of digital computers. 
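
The modification of an address can be imitated in a few lines of Python. The sketch below is only an illustration under stated assumptions: the operation names (`add`, `div`, `add_addr`, `jump_ge`, `stop`), the tuple layout of an instruction and the rule that the transfer of control jumps when the first operand is not less than the second are hypothetical choices made here, not the codes of any real machine. The program is read as storing the reciprocals of 2001, ..., 3000 in locations 10 to 1009.

```python
def run(mem, start=1, max_steps=100_000):
    """Run a toy three-address program stored in the dict `mem`.

    Instructions are tuples (op, a, b, c); numbers are stored as plain values.
    All operation names and their meaning are assumptions made for this sketch.
    """
    pc = start
    for _ in range(max_steps):
        op, a, b, c = mem[pc]
        pc += 1
        if op == "add":            # mem[c] = mem[a] + mem[b]
            mem[c] = mem[a] + mem[b]
        elif op == "div":          # mem[c] = mem[a] / mem[b]
            mem[c] = mem[a] / mem[b]
        elif op == "add_addr":     # add mem[b] to the last address of instruction a
            o, x, y, z = mem[a]
            mem[c] = (o, x, y, z + mem[b])
        elif op == "jump_ge":      # go to instruction c if mem[a] >= mem[b]
            if mem[a] >= mem[b]:
                pc = c
        elif op == "stop":
            return mem
        else:
            raise ValueError(f"unknown operation {op!r}")
    raise RuntimeError("step limit exceeded (closed loop?)")


program = {
    1: ("add", 6, 7, 6),        # increase the current integer by 1
    2: ("add_addr", 3, 9, 3),   # shift the result address of instruction 3
    3: ("div", 7, 6, 9),        # reciprocal of the current integer
    4: ("jump_ge", 8, 6, 1),    # repeat while mem[8] >= mem[6]
    5: ("stop", 0, 0, 0),
    6: 2000,
    7: 1,
    8: 2999,
    9: 1,                       # auxiliary number used only to modify instruction 3
}
memory = run(dict(program))
```

After the run, instruction 3 has been rewritten 1000 times and each reciprocal sits in its own location, while the data in locations 7, 8 and 9 are left untouched.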

The conditional transfer instruction is used not only in employing 
loops but also when it is necessary to introduce the branching in- 
struction which causes the computer to perform different sequences 
of operations depending on some circumstances unknown beforehand. 
For example, suppose that a number a must appear in a certain
location B, and let it be necessary to retain it unchanged if a > 0 and
square it and store the result in the same location B if a < 0. In the
zeroth location the number 0 is usually placed, and therefore we can
realize the desired procedure by writing in the corresponding place 
of the program the following instructions: 


(k+1) × B B B

Branching is then performed automatically, and when the program 
has been executed we shall not even know which variant has been 
realized unless a special instruction causing the machine to output 
the information concerning this procedure is introduced into the 

program.
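
The branching just described can be pictured with a short Python sketch. It assumes, purely for illustration, that the memory is a dictionary, that location 0 holds the number 0 and that the conditional transfer skips the multiplication when the tested number is not less than zero; the function name is hypothetical.

```python
def square_if_negative(mem, b):
    """Leave mem[b] unchanged if it is not less than mem[0] = 0, otherwise square it."""
    if mem[b] >= mem[0]:        # conditional transfer: skip the multiplication
        return mem
    mem[b] = mem[b] * mem[b]    # executed only on the other branch
    return mem
```

Nothing in the final contents of the memory records which branch was taken, which is the point made in the text.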
Finally, let us consider an example of a program in which the
total number of operations is not set beforehand. Let it be necessary
to solve the cubic equation

$$z = 0.1 z^3 + 1$$

by applying the iterative method (see Sec. V.3) and beginning with
the initial approximation $z_0 = 0$. To do this we place the number
0 in location (α + 1) (this practically means that the corresponding
row of the punch card is not punched at all), the number 0.1 in
location (α + 2) and the number 1 in location (α + 3). The successive
approximations appearing in the process of calculations will
be placed in location (α + 1).

After a certain approximation has been computed the calculations 
yielding the next approximation can be carried out according to the 
following instructions: 

(1) × α+1 α+1 α+4
(2) × α+4 α+1 α+4
(3) × α+2 α+4 α+4
(4) + α+4 α+3 α+4

Thus, the new approximation will be placed in location (α + 4).
It must be compared with the preceding approximation stored in
location (α + 1). If the approximations differ the result must be
transferred to location (α + 1) and then the iteration should be
repeated. If the approximations coincide the result must be printed
and the machine must be stopped. This can be realized by means
of the following instructions (check!):

(5) |−| α+1 α+4 α+1
[the execution of this instruction results in sending the absolute
value of the difference between the contents of locations (α + 1)
and (α + 4) to location (α + 1)]
(6) Transfer of control 0 α+1 9
(7) + α+4 0 α+1
(8) Transfer of control 1 1 1
(9) Print α+4
(10) Stop


Consequently, we can put α = 10 and write down the whole program
which occupies 13 locations. The program will cause the machine
to perform the iterations until a subsequent approximation
coincides with the preceding one (note that if the iterative process 
converges this aim is necessarily achieved because the results of 
the calculations are automatically rounded off). Then the machine 
will print the result and stop. It should be noted that if the iterative 
process does not converge either an overflow of locations will occur 
or the machine will go into a closed loop, that is repeat a certain 
sequence of instructions over and over again. In the latter case the 
machine cannot stop on its own, and we must stop it by pressing 
the stop key on the keyboard. 
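
The stopping rule of this program (repeat until two successive rounded approximations coincide) can be tried out in Python. The sketch below is an illustration only: it relies on ordinary floating-point rounding in place of the rounding of the machine described in the text, and the function names and the iteration limit are assumptions introduced here.

```python
def iterate_until_repeat(g, z0, max_iter=10_000):
    """Repeat z -> g(z) until two successive rounded values coincide."""
    z = z0
    for count in range(1, max_iter + 1):
        z_new = g(z)
        if z_new == z:            # approximations coincide: print and stop
            return z_new, count
        z = z_new                 # otherwise transfer the result and repeat
    raise RuntimeError("no repetition found: the process may not converge")


def g(z):
    """Right-hand side of the equation z = 0.1 z^3 + 1."""
    return 0.1 * z * z * z + 1.0


root, steps = iterate_until_repeat(g, 0.0)
```

The iteration limit plays the part of the stop key: if the process does not converge, the loop is interrupted instead of running forever.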

Programs for solving more complicated problems can be very
extensive. But they often include some simpler problems, for in- 
stance, such as the problem of computing the value of the sine for 
a given value of the argument entering into the main problem and 
so on. These simpler problems are encountered very often and it is 
therefore expedient to have special subroutines for solving them. 
These subroutines are composed beforehand and stored in certain 
locations in the external or internal memory. When a problem of 
this kind is encountered in solving a more complicated problem a
special instruction is introduced into the program which makes 
the control unit take the corresponding subroutine out of the memory. 

To facilitate programming, several programming languages for 
communicating with computers have been recently developed. 
A program can be written in these languages so that all the necessary 
procedures can be readily and accurately expressed in terms which 
are closer to the language of mathematics than those described above. 
A program written in such a language is independent of the type of 
the machine we use and is printed by means of a device resembling 
an ordinary typewriter. Then the program is automatically trans- 
lated into the machine code by means of a special processor. Among 
these languages we mention the FORTRAN language (FORmula 
TRANslating system) and the ALGOL 60 (ALGOrithmic Language 
developed in 1960). The introduction of the languages in practice 
will make the application of computers available for many scien- 
tists and engineers. By the way, in some cases an experienced pro- 
grammer can compose a program in the machine code more econo- 
mically without using a computer language because he takes into 
account some peculiarities of the concrete machine. This is especial- 
ly important when we have to economize machine time. 

The errors occurring in a calculation process can appear because 
of the mistakes in composing or punching the program or due to mal- 
functions of the machine itself. The latter can be systematic (for 
instance, when some elements get out of order) or random (when an 
element passes from one state to the other on its own, at random). 
Systematic errors are checked by means of built-in checks or sup- 
plementary programmed checks based on solving certain problems 
with the answers known in advance. The correctness of composing 
a program which is to be introduced into the machine is checked 
before the machine is started. When checking a program we usually 
make the machine execute some parts of the program and try to 
roughly estimate certain intermediate results which is very impor- 
tant because these results must not exceed the capacity of locations 
of the memory (see Sec. 4). To check the punching we usually repe- 
atedly punch the program and cause the machine to compare auto- 
matically the corresponding cards from the two decks with one ano- 
ther. To detect random malfunctions we can apply some well-known 
arithmetical rules used for checking the results of calculations. In 
more important cases the result is computed repeatedly in order 
to compare the answers. Besides, computer designers try to eliminate
the possibility of random malfunctions by perfecting the machine
in the process of designing, manufacturing and modifying
its components.


APPENDIX 


Equations of Mathematical 
Physics 


Equations of mathematical physics are mainly partial differential 
equations describing various physical processes. The theory of these 
equations is an important division of mathematics with many ap- 
plications to physics, engineering and other branches of science. 
Here we shall present some elementary facts of the theory. 


§ 1. Classical Equations of Mathematical Physics 


1. Derivation of Some Equations. Let us consider the process of
longitudinal vibrations of an elastic rectilinear homogeneous bar.
We shall draw the x-axis along the bar and denote by x the coordinate
of the corresponding point of the bar in the state of equilibrium
when the bar is unloaded. Let u = u (x, t) be the longitudinal
displacement of the point x at moment t. In investigating the
phenomenon we shall assume the hypothesis of plane sections, that is we
shall suppose that the plane sections of the bar move in the process
of vibration in such a way that they remain parallel to their initial
positions all the time (practically this means that the deviations
from this condition are inessential and can be disregarded).

The function u (x, t) determines the law of vibrations of the bar.
It should be noted that the term “vibrations” is understood in a
conditional sense here because the process of deformation of the bar
is an oscillatory motion only in certain simpler cases whereas in
the general case the motion can be of a more complicated nature.

In a vibration process there appear elastic stresses in the cross
sections of the bar. We shall suppose that the corresponding
deformations lie within the limits of applicability of the well-known
Hooke’s law (discovered in 1660 by the English mathematician and
inventor R. Hooke, 1635-1703). The law states that unless a certain limit
is exceeded the normal stress $\sigma$ is directly proportional to the
relative longitudinal elongation $\varepsilon$, i.e. $\sigma = E\varepsilon$ where $E$ is Young’s
modulus characterizing the properties of the material. Therefore the
stress can be readily expressed in terms of the function u because the
relative longitudinal elongation of an element dx of the bar is equal
to $\frac{\partial u}{\partial x}$ (see Fig. 352) and thus we have

$$\sigma = E \frac{\partial u}{\partial x} \qquad (1)$$


Now we can write down the equation of motion of the element dx
of the bar. Let us suppose, for generality, that the bar is subjected
to a longitudinal distributed load of intensity p = p (x, t). Let F
denote the constant cross-section area of the bar. Then the resultant
force acting upon the element is equal to

$$\left[\sigma F + \frac{\partial}{\partial x}(\sigma F)\,dx\right] - \sigma F + p\,dx = F \frac{\partial \sigma}{\partial x}\,dx + p\,dx = \left(FE \frac{\partial^2 u}{\partial x^2} + p\right) dx$$

Fig. 352 (the element dx in the free equilibrium state and the same
element in the process of vibrations)

Denoting the density of the material of the bar by $\rho$ we write, on
the basis of Newton’s second law, the equation

$$\left(FE \frac{\partial^2 u}{\partial x^2} + p\right) dx = \rho F\,dx\,\frac{\partial^2 u}{\partial t^2}$$

Finally, dividing by $\rho F\,dx$ and introducing the notation

$$\frac{E}{\rho} = a^2, \qquad \frac{p}{\rho F} = f(x, t)$$

we derive the equation

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} + f(x, t) \qquad (2)$$

where f (x, t) is the given function and u = u (x, t) is the
sought-for one.


Let us derive another important equation, namely, the heat equation
describing the process of propagation of heat in a medium.
We shall restrict ourselves to the case of a homogeneous isotropic
medium. Let us denote by u = u (x, y, z, t) the temperature at
the point (x, y, z) of the medium at moment t. In deducing the
equation which must be satisfied by the function u we shall apply the
Fourier law which states that at each point of a medium the thermal
energy is transferred in the direction of −grad u (which means
that the heat propagates from the regions of higher temperature to
those of lower temperature) and that the speed of the propagation
is proportional to | grad u |. This law holds with a sufficient
accuracy when the variations of the temperature are comparatively
small. To put down the mathematical expression of the law we take
into account that it means that the quantity of heat passing through
a surface element (dS) during the time period dt is equal to

$$dQ = k\,\mathrm{grad}_n u\,dS\,dt$$


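where k is the coefficient of heat conduction and n is the outer unit
normal vector to the area. Then the total quantity of heat passing
during time dt into the interior of a solid $(\Omega)$ with boundary surface
(S) is equal to

$$k \oint_{(S)} \mathrm{grad}_n u\,dS\,dt = k \oint_{(S)} \mathrm{grad}\,u \cdot d\mathbf{S}\,dt$$

(compare this with Sec. XVI.22). Applying Ostrogradsky’s formula
(see Sec. XVI.23) and considering the volume $(\Omega)$ to be small we
obtain (compare with Sec. XVII.22), to within infinitesimals of
higher order, the relation

$$dQ = k \int_{(\Omega)} \mathrm{div}\,\mathrm{grad}\,u\,d\Omega\,dt = k \int_{(\Omega)} \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) d\Omega\,dt = k \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) \Omega\,dt$$

Further we shall write $(d\Omega)$ instead of $(\Omega)$.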

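Now we proceed to write the equation of heat balance for the
element of volume $(d\Omega)$. Let us suppose, for generality, that there
are sources of thermal energy in the medium distributed with density
q = q (x, y, z, t). The quantity of heat in the volume $(d\Omega)$ is
equal to $c\rho u\,d\Omega$ where c is the specific heat and $\rho$ is the density of
the substance. It follows that

$$\partial_t (c\rho u\,d\Omega) = k \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) d\Omega\,dt + q\,d\Omega\,dt$$

where $\partial_t$ denotes the partial differential with respect to t, so that
the left-hand side is equal to $c\rho \frac{\partial u}{\partial t}\,dt\,d\Omega$.
Cancelling out $c\rho\,d\Omega\,dt$ in both sides and introducing the notation
$\frac{k}{c\rho} = a$ (a is the coefficient of temperature conductivity) and
$\frac{q}{c\rho} = f$ we arrive at the heat equation

$$\frac{\partial u}{\partial t} = a \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) + f(x, y, z, t) \qquad (3)$$

where the function f (x, y, z, t) is given and u = u (x, y, z, t) is
the sought-for function.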
If there are no sources of thermal energy in the part of space under
consideration equation (3) takes the simplified form

$$\frac{\partial u}{\partial t} = a \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) \qquad (4)$$


2. Some Other Equations. It turns out that various physical pro- 
cesses are described by means of similar differential equations. 

For instance, equation (2) describes not only the process of longi- 
tudinal vibrations of a bar but also the process of small transverse 
oscillations of a taut string (in this case u denotes the transverse 
deflection; see Sec. XVII.31), the process of longitudinal oscilla- 
tions of a gas in a tube (in this case u is the pressure or the density), 
the oscillations of an electric current in a wire with distributed 
resistance and inductance when there are no energy losses etc. 

The equation

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} \qquad (5)$$

is of particular importance. It describes free one-dimensional oscil- 
lations of a homogeneous medium, whereas equation (2) describes 
forced oscillations. In the case of a non-homogeneous medium the 
equation becomes more complicated. 

These equations should be accordingly changed in the case of two- 
dimensional and three-dimensional vibration processes. For instance, 
the equation of vibrations of a homogeneous isotropic three-di- 
mensional medium is of the form 

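$$\frac{\partial^2 u}{\partial t^2} = a^2 \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) + f(x, y, z, t) \qquad (6)$$

in the case of forced oscillations and of the form

$$\frac{\partial^2 u}{\partial t^2} = a^2 \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right)$$

in the case of free oscillations. These equations are also satisfied
by the projections of the vectors of electric and magnetic field
intensities, by the projections of the displacement vector of elastic
vibrations of a body and so on.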

Equations (3) and (4) are of another type. We have seen that
these equations are satisfied by the temperature in the process of
heat transfer in an isotropic homogeneous medium. It can be shown
that the same equations describe the diffusion processes if u denotes
the density of a diffusing substance. Equations (3) and (4) can also
be considered in a plane or on a straight line. For instance, in the
latter case the equations take, respectively, the forms

$$\frac{\partial u}{\partial t} = a \frac{\partial^2 u}{\partial x^2} + f(x, t) \qquad (7)$$

and

$$\frac{\partial u}{\partial t} = a \frac{\partial^2 u}{\partial x^2} \qquad (8)$$


If the process in question is stationary both the sought-for function
u and the function f describing an external action must be
independent of t. Then from (6) we obtain the equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = -\frac{1}{a^2} f(x, y, z) = f_1(x, y, z) \qquad (9)$$

which is referred to as Poisson’s equation. When, in a part of space,
there are no external actions, we obtain the equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0 \qquad (10)$$

called Laplace's equation. Equations (3) and (4) yield the same
equations (9) and (10) in the stationary case.

Here we have put down only those equations of mathematical 
physics which are most thoroughly studied. There are many other 
equations describing various phenomena. 

3. Initial and Boundary Conditions. We showed in Sec. XV.2
that every ordinary differential equation possesses infinitely many
solutions and that to isolate a concrete solution we need some
subsidiary conditions. The same is true for partial differential
equations for which we usually set initial or boundary conditions or
both as the subsidiary conditions.

We now return to the problem of longitudinal vibrations of a bar
(see Sec. 1). It seems natural that to obtain a concrete process of
vibrations we must set the initial conditions specifying the initial
state of the bar. From the point of view of physics the initial state
is completely determined by the corresponding initial displacements
and initial velocities of the points of the bar [compare with
conditions (XV.10)]. Thus, for equation (2), we write the initial
conditions

$$u|_{t=0} = \varphi(x) \quad\text{and}\quad \frac{\partial u}{\partial t}\Big|_{t=0} = \psi(x) \qquad (11)$$

where the functions $\varphi$ and $\psi$ are given [compare with formulas
(XVII.133)].

If we investigate a portion of the bar which is placed so far from
its ends that their influence is inessential during the time period
in question (in other words, if it is possible to regard the bar as
being infinite) it is sufficient to know only initial conditions (11)
in order to determine the solution. In this case we arrive at a
problem with initial conditions alone, i.e. Cauchy’s problem. The
initial conditions and Cauchy’s problem are of a similar type for
equation (6) in a plane or in space [equation (6) is referred to as the
wave equation]. The initial conditions for heat equation (3) or (7)
differ from those of (6) because the physical meaning of the problem
indicates that to specify uniquely a process of heat transfer it is
sufficient to specify only the initial distribution of temperature and
therefore it is only the initial values of u that should be set for t = 0.

If the influence of the ends of the bar cannot be neglected then
besides the initial conditions we must set certain boundary conditions
which describe the processes on the boundary of the medium in
question, i.e. at the ends of the bar in our case. Boundary conditions
can be of different forms depending on the processes, and they should
be set for both ends (x = 0 and x = l) independently. For instance,
let us consider the left end.

It can be rigidly fixed and then we have

$$u|_{x=0} = 0 \quad (\text{i.e. } u(0, t) = 0) \qquad (12)$$

A more general condition can describe the motion of the left end
according to a given law of the form

$$u|_{x=0} = \chi(t) \qquad (13)$$

where the function $\chi(t)$ is given. Condition (13), and its special
case (12), is called a condition of the first kind.

If the left end is free the condition must express the fact that there
is no normal stress there. Thus, by (1), we have

$$\frac{\partial u}{\partial x}\Big|_{x=0} = 0$$

This condition [and also a more general condition $\frac{\partial u}{\partial x}\big|_{x=0} = \chi(t)$]
is called a condition of the second kind.

In the case of an elastic fixing of the left end the normal stress
in the end point section is proportional to its displacement, that
is we have $\sigma = ku$ at the end-point. From this, on the basis of
formula (1), we deduce the condition

$$\left(E \frac{\partial u}{\partial x} - ku\right)\Big|_{x=0} = 0, \quad\text{i.e.}\quad \left(\frac{\partial u}{\partial x} - \alpha u\right)\Big|_{x=0} = 0 \quad \left(\alpha = \frac{k}{E}\right) \qquad (14)$$

which is referred to as a condition of the third kind; the same term
is applied to the corresponding non-homogeneous condition. (Let
the reader verify that in the case of the right end the similar
condition contains + instead of − in front of $\alpha$.)


Thus, if, for instance, the left end of the bar is rigidly fixed and
the right end is free the boundary conditions are of the form

$$u|_{x=0} = 0, \qquad \frac{\partial u}{\partial x}\Big|_{x=l} = 0$$


The problem of solving a partial differential equation when both 
initial and boundary conditions are given is referred to as a mixed 
problem (boundary-initial-value problem). 


If an equation is solved in a three- or two-dimensional domain
a boundary condition of the first kind reduces to specifying the 
values of the sought-for function on the boundary of the domain 
and a condition of the second kind consists in setting the values 
of the derivative of the function along the normal to the boundary. 

Thus, if the influence of the boundary of the domain is essential 
the equation of the non-stationary process in question must be solved 
for given initial and boundary conditions. It is apparent that in 
the case of the equation of a stationary process [for instance, equation 
(9) or (10)] we set only boundary conditions. In the latter case, for
a condition of the first kind, the problem is called the Dirichlet 
problem or the first boundary value problem (after the German ma- 
thematician P. Dirichlet, 1805-1859). Accordingly, for a condition 
of the second kind it is called the Neumann problem or the second 
boundary value problem after the German mathematician K. Neu- 
mann (1832-1925) who for the first time systematically investiga- 
ted the problem in 1877. 


§ 2. Method of Separation of Variables 


There are many methods of solving equations of mathematical 
physics. Here we shall discuss one of the most important methods 
referred to as the method of separation of variables. 

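4. Basic Example. Let us consider the problem of solving the
equation

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} \qquad (0 < x < l,\ 0 < t < \infty) \qquad (15)$$

for the simplest boundary conditions

$$u|_{x=0} = 0, \qquad u|_{x=l} = 0 \qquad (0 < t < \infty) \qquad (16)$$

and initial conditions

$$u|_{t=0} = \varphi(x), \qquad \frac{\partial u}{\partial t}\Big|_{t=0} = \psi(x) \qquad (0 < x < l) \qquad (17)$$

where $\varphi$ and $\psi$ are some given functions.*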

We shall interpret the function u = u (x, t) as the transverse
deflection of a taut vibrating unloaded string (see Sec. XVII.31)
fixed at the ends (x = 0 and x = l) and the functions $\varphi$ and $\psi$ in
conditions (17) as an initial deflection and an initial velocity. We
can, of course, interpret u (x, t) as a longitudinal displacement
of the point x of a bar and the like.


* In Sec. XVII.31 we solved this problem [see equation (121) and condi- 
tions (132) and (133)] by means of expansions in Fourier series. The method 
of separation of variables, as we shall see, reduces to Fourier expansions and 
yields the same result [see formulas (25) and (XVII.135)].—Tr.
The main idea of the method of separation of variables is that
we seek a solution of equation (15) of the special form

$$u(x, t) = X(x)\,T(t) \qquad (18)$$

under boundary conditions (16) (but without any initial conditions).
Hence, we are interested in a solution which is the product of a
function dependent only on x by a function dependent only on t.
If we want to investigate the form of the string corresponding to
solution (18) at the subsequent moments $t_1, t_2, t_3, \ldots$ we must
multiply the fixed function X (x) by the constant factors $T(t_1)$,
$T(t_2)$, $T(t_3)$, .... Therefore the zeros of the function X (x)
remain all the time the zeros of the function u (x, t) (and are
referred to as the nodes). Accordingly, the points of extrema
of the function X (x) remain the points of extrema of u (x, t)
(these are called the antinodes). The form of the vibrating string at
the successive moments of time is depicted in Fig. 353. A vibrational
state of peculiar type (18) is called a standing wave. The function
X (x) describes the form of the standing wave and T (t)
expresses the law of its variation in time. Hence, we have posed
the problem of finding the standing waves which are possible for
the given boundary conditions.

Fig. 353
The substitution of (18) into (15) results in

$$X(x)\,T''(t) = a^2 X''(x)\,T(t), \quad\text{i.e.}\quad \frac{T''(t)}{a^2 T(t)} = \frac{X''(x)}{X(x)}$$

The left-hand side being independent of x and the right-hand side
being independent of t, the last equality can be fulfilled if and only
if both sides contain neither x nor t. Hence, both sides are equal to
the same constant which we denote by $-\lambda$ (but $\lambda$ itself can be of
an arbitrary sign):

$$\frac{T''(t)}{a^2 T(t)} = \frac{X''(x)}{X(x)} = -\lambda$$

It follows that T (t) and X (x) must satisfy the corresponding
ordinary differential equations

$$T''(t) + a^2 \lambda T(t) = 0 \qquad (19)$$

and

$$X''(x) + \lambda X(x) = 0 \qquad (20)$$

Thus, the independent variables have been separated!
Now recall that we have imposed conditions (16) on solution (18).
The first condition (16) yields X (0) T (t) = 0 which implies
X (0) = 0 [if $X(0) \neq 0$ we must have $T(t) \equiv 0$ and then $u(x, t) \equiv 0$
which does not yield a standing wave]. We similarly consider the
second condition (16) and thus arrive at the boundary conditions
for a standing wave:

$$X(0) = 0, \qquad X(l) = 0 \qquad (21)$$

Consequently, to find all the possible forms of standing waves we
must solve equation (20) with boundary conditions (21) (a similar
problem was treated in Sec. XV.16 but here we shall not use the
results obtained there). It is clear that the function $X \equiv 0$ satisfies
both equation (20) and conditions (21) for any $\lambda$ but we are not
interested in the function since it does not yield a standing wave.
Thus, the only solutions we are interested in are those satisfying
the condition $X(x) \not\equiv 0$. Such solutions, generally speaking, may
not exist for all $\lambda$. The values of $\lambda$ for which these solutions exist
are called the eigenvalues of problem (20) with conditions (21).
The solutions X (x) are called the eigenfunctions of the problem
corresponding to these eigenvalues.

We first suppose that $\lambda < 0$, i.e. $\lambda = -\nu^2$. Then equation (20)
has the general solution

$$X(x) = C_1 e^{\nu x} + C_2 e^{-\nu x}$$

(check it up!), and conditions (21) imply that

$$C_1 + C_2 = 0 \quad\text{and}\quad C_1 e^{\nu l} + C_2 e^{-\nu l} = 0$$

It follows that $C_2 = -C_1$ and therefore

$$C_1 e^{\nu l} - C_1 e^{-\nu l} = 0, \quad\text{i.e.}\quad C_1 \left(e^{2\nu l} - 1\right) e^{-\nu l} = 0$$

But the second and the third factors are different from zero (why?).
Hence $C_1 = 0$ and consequently $C_2 = 0$ and $X(x) \equiv 0$. Thus,
there are no negative eigenvalues of this problem. Let the reader
prove that the value $\lambda = 0$ is not an eigenvalue either.

Now let $\lambda = k^2 > 0$. Then equation (20) has the general solution

$$X(x) = C_1 \cos kx + C_2 \sin kx \qquad (22)$$

(check it up!). Conditions (21) imply that

$$C_1 = 0 \quad\text{and}\quad C_2 \sin kl = 0 \qquad (23)$$

But we must have $C_2 \neq 0$ (why?) and hence sin kl = 0. From this
we deduce

$$kl = n\pi \quad\text{and}\quad k = \frac{n\pi}{l} \qquad (n = 1, 2, 3, \ldots)$$

Thus, the problem has the following infinite set (spectrum) of
eigenvalues:

$$\lambda = \lambda_n = \left(\frac{n\pi}{l}\right)^2 \qquad (n = 1, 2, 3, \ldots)$$

The corresponding eigenfunctions are implied by (22):

$$X_n(x) = \sin\frac{n\pi x}{l} \qquad (n = 1, 2, \ldots) \qquad (24)$$

We have not put down the factor $C_2$ here because any constant factor
can be included into T (t) in the expression (18).
Substituting the value $\lambda = \lambda_n = \left(\frac{n\pi}{l}\right)^2$ (n = 1, 2, 3, ...)
found into equation (19) we get

$$T(t) = A \cos\frac{a n \pi}{l} t + B \sin\frac{a n \pi}{l} t$$

where A and B are arbitrary constants. Thus, we have obtained
harmonic vibrations with frequency $\omega_n = \frac{a n \pi}{l}$. Consequently,
according to (18), the sought-for standing waves are of the form

$$u(x, t) = \left(A \cos\frac{a n \pi}{l} t + B \sin\frac{a n \pi}{l} t\right) \sin\frac{n\pi x}{l} \qquad (n = 1, 2, 3, \ldots)$$

Fig. 354

(The first three standing waves corresponding to n = 1, 2, 3 are
shown in Fig. 354.) The frequencies of vibrations of these waves
are equal to

$$\omega_1 = \frac{a\pi}{l}, \quad \omega_2 = \frac{2a\pi}{l} = 2\omega_1, \quad \omega_3 = \frac{3a\pi}{l} = 3\omega_1, \quad \ldots$$


As it is said in acoustics, the first standing wave corresponds to
the fundamental tone and the subsequent standing waves whose
frequencies are 2, 3, 4 etc. times the frequency of the fundamental
tone determine the overtones (see Sec. XVII.23).
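
A quick numerical check of these standing waves can be made in Python. The sketch below is an illustration only: the values of a, l, A and B, the test point and the step h of the central differences are arbitrary assumptions. It estimates the difference between the second time derivative and $a^2$ times the second space derivative for n = 1, 2, 3 and lists the corresponding frequencies.

```python
import math


def standing_wave(x, t, n, a, l, A=1.0, B=0.5):
    """n-th standing wave (A cos w t + B sin w t) sin(n pi x / l), w = a n pi / l."""
    w = a * n * math.pi / l
    return (A * math.cos(w * t) + B * math.sin(w * t)) * math.sin(n * math.pi * x / l)


def wave_residual(x, t, n, a, l, h=1e-4):
    """Central-difference estimate of u_tt - a^2 u_xx for the n-th standing wave."""
    def u(xx, tt):
        return standing_wave(xx, tt, n, a, l)

    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h ** 2
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h ** 2
    return u_tt - a * a * u_xx


a, l = 2.0, 1.5
residuals = [wave_residual(0.4, 0.3, n, a, l) for n in (1, 2, 3)]
frequencies = [a * n * math.pi / l for n in (1, 2, 3)]
```

The residuals are small compared with the individual derivatives, and the second and third frequencies are twice and three times the first one.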

Let us now construct the general solution of equation (15) with 
boundary conditions (16) which must describe the general form 
of vibrations that are possible under the given boundary conditions. 
This general solution is obtained as a combination (superposition) 
of the above standing waves with different amplitudes, i.e. it has 
the form 


$$u(x, t) = \sum_{n=1}^{\infty} \left(A_n \cos\frac{a n \pi}{l} t + B_n \sin\frac{a n \pi}{l} t\right) \sin\frac{n\pi x}{l} \qquad (25)$$
where $A_n$ and $B_n$ (n = 1, 2, 3, . . .) are arbitrary constants.
Equation (15) and boundary conditions (16) being linear and
homogeneous, the whole sum (25) satisfies them because each summand
satisfies the equation and the conditions (compare with property 4 in
Sec. XV.14). To prove that formula (25) represents the general
solution we must show that the arbitrary constants $A_n$ and $B_n$ can be
so chosen that any initial conditions (17) should be satisfied. Note
that in contrast to an ordinary differential equation whose general
solution contains a finite number of arbitrary constants the general
solution of problem (15), (16) depends on two infinite sequences of
arbitrary constants or, which is the same, on two arbitrary functions,
namely, the functions $\varphi(x)$ and $\psi(x)$ entering into conditions (17).

To satisfy the first condition (17) we put t = 0 in (25) which yields

$$\varphi(x) = \sum_{n=1}^{\infty} A_n \sin\frac{n\pi x}{l} \qquad (0 < x < l) \qquad (26)$$

Thus, the given function $\varphi(x)$ must be expanded into a series in
eigenfunctions of problem (20), (21). In this particular case this is
nothing but an expansion into a Fourier series of form (XVII.107).
As it was shown in Sec. XVII.22, expansion (26) is possible, and
the coefficients of the expansion are found on the basis of the
orthogonality condition:

$$A_n = \frac{\int_0^l \varphi(x) \sin\frac{n\pi x}{l}\,dx}{\int_0^l \sin^2\frac{n\pi x}{l}\,dx} = \frac{2}{l} \int_0^l \varphi(x) \sin\frac{n\pi x}{l}\,dx \qquad (27)$$

To satisfy the second condition (17) we differentiate both sides of
equality (25) with respect to t and then put t = 0:

$$\psi(x) = \sum_{n=1}^{\infty} B_n \frac{a n \pi}{l} \sin\frac{n\pi x}{l}$$

After a manner of (27) we obtain the relation

$$B_n = \frac{2}{a n \pi} \int_0^l \psi(x) \sin\frac{n\pi x}{l}\,dx \qquad (28)$$

(check up the calculations!).
Thus, the solution of the original problem (15)-(17) is given by
formula (25) in which the coefficients are defined by formulas (27)
and (28).
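
The whole procedure can be tried numerically. The following Python sketch is only an illustration under stated assumptions: the integrals in (27) and (28) are replaced by the midpoint rule, the series (25) is cut off after a fixed number of terms, and the function names, the sample initial data and the values of a and l are choices made here.

```python
import math


def sine_coefficient(func, n, l, samples=2000):
    """(2/l) * integral over [0, l] of func(x) sin(n pi x / l), midpoint rule."""
    h = l / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * h
        total += func(x) * math.sin(n * math.pi * x / l)
    return 2.0 / l * total * h


def string_solution(phi, psi, a, l, terms=30):
    """Partial sum of series (25) with coefficients from (27) and (28)."""
    A = [sine_coefficient(phi, n, l) for n in range(1, terms + 1)]
    B = [l / (a * n * math.pi) * sine_coefficient(psi, n, l)
         for n in range(1, terms + 1)]

    def u(x, t):
        total = 0.0
        for n, (A_n, B_n) in enumerate(zip(A, B), start=1):
            w = a * n * math.pi / l          # frequency of the n-th standing wave
            amplitude = A_n * math.cos(w * t) + B_n * math.sin(w * t)
            total += amplitude * math.sin(n * math.pi * x / l)
        return total

    return u


a, l = 1.0, 2.0
u = string_solution(lambda x: math.sin(math.pi * x / l),      # initial deflection
                    lambda x: math.sin(2 * math.pi * x / l),  # initial velocity
                    a, l)
```

For these sample data only the first two standing waves are excited, so the computed sum can be compared with the closed expression $\cos\frac{a\pi t}{l} \sin\frac{\pi x}{l} + \frac{l}{2a\pi} \sin\frac{2a\pi t}{l} \sin\frac{2\pi x}{l}$.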
5. Some Other Problems.

1. Suppose that instead of equation (15) we consider homogeneous
heat equation (8) with the same simplest boundary conditions (16)
and with initial conditions of the form

$$u|_{t=0} = \varphi(x) \qquad (0 < x < l)$$

(see Sec. 3). After substitution (18) has been performed and the
variables have been separated we arrive at the same problem (20),
(21) whose solution yields the eigenfunctions and the eigenvalues
(let the reader perform the calculations!). But instead of (19) we
obtain the equation

$$T'(t) + a\lambda T(t) = 0$$

for the function T (t). It follows that

$$T(t) = A e^{-a\lambda t}$$

Therefore we obtain the formula

$$u(x, t) = \sum_{n=1}^{\infty} A_n e^{-\frac{a n^2 \pi^2}{l^2} t} \sin\frac{n\pi x}{l}$$

which expresses the general solution of the equation under the given
boundary conditions. This formula substitutes for expression (25)
in this case. The coefficients $A_n$ can be found here by means of the
same formula (27).

We see that the set of the eigenfunctions and eigenvalues remains
the same as before in this case, but the dependence on time is
of a different type here. Instead of harmonic oscillations which
we had in Sec. 4 we obtain exponentially damped functions in this
case (which results in $\lim_{t \to \infty} u = 0$). By the way, the physical
meaning of the problem in question implies that the function u (x, t)
must behave in this peculiar way as t increases (why?).
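
The damping can be observed numerically. The Python sketch below is an illustration under stated assumptions: the initial temperature x (l − x), the values of a and l, the number of terms and the midpoint rule used for formula (27) are all choices made here and are not taken from the text.

```python
import math


def heat_solution(phi, a, l, terms=30, samples=2000):
    """Partial sum of the series solution of u_t = a u_xx with u = 0 at x = 0 and x = l."""
    h = l / samples
    A = []
    for n in range(1, terms + 1):
        # formula (27) evaluated by the midpoint rule
        s = sum(phi((i + 0.5) * h) * math.sin(n * math.pi * (i + 0.5) * h / l)
                for i in range(samples))
        A.append(2.0 / l * s * h)

    def u(x, t):
        return sum(A_n * math.exp(-a * (n * math.pi / l) ** 2 * t)
                   * math.sin(n * math.pi * x / l)
                   for n, A_n in enumerate(A, start=1))

    return u


a, l = 0.5, 1.0
u = heat_solution(lambda x: x * (l - x), a, l)   # sample initial temperature
```

The higher terms die out much faster than the first one, so after a short time the temperature profile is practically a single half-wave of a sine with a decreasing amplitude.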
Now take the Laplace equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \qquad (0 < x < l,\ 0 < y < m)$$

with the conditions

$$u|_{x=0} = 0, \qquad u|_{x=l} = 0 \qquad (0 < y < m)$$

Let the reader verify that the same method yields the general
solution of the form

$$u(x, y) = \sum_{n=1}^{\infty} \left(A_n e^{\frac{n\pi}{l} y} + B_n e^{-\frac{n\pi}{l} y}\right) \sin\frac{n\pi x}{l}$$

for this problem. The constants $A_n$ and $B_n$ can be specified if some
subsidiary boundary conditions for y = 0 and y = m are given
(then we substitute y = 0 and y = m into the above solution etc.).
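
As an illustration of the last remark, the Python sketch below determines the constants from boundary values prescribed for y = 0 and y = m. It is a sketch under assumptions: the boundary functions `f0` and `fm`, the number of terms and the midpoint rule for the sine coefficients are hypothetical choices. For each n the two constants are found from the pair of linear equations obtained by substituting y = 0 and y = m into the series.

```python
import math


def laplace_strip_solution(f0, fm, l, m, terms=10, samples=2000):
    """Series solution with u = f0(x) at y = 0 and u = fm(x) at y = m (assumed data)."""
    h = l / samples

    def sine_coefficient(func, n):
        s = sum(func((i + 0.5) * h) * math.sin(n * math.pi * (i + 0.5) * h / l)
                for i in range(samples))
        return 2.0 / l * s * h

    coeffs = []
    for n in range(1, terms + 1):
        kappa = n * math.pi / l
        p, q = sine_coefficient(f0, n), sine_coefficient(fm, n)
        # Solve A + B = p and A e^{kappa m} + B e^{-kappa m} = q.
        det = math.exp(-kappa * m) - math.exp(kappa * m)
        A = (p * math.exp(-kappa * m) - q) / det
        B = (q - p * math.exp(kappa * m)) / det
        coeffs.append((kappa, A, B))

    def u(x, y):
        return sum((A * math.exp(kappa * y) + B * math.exp(-kappa * y))
                   * math.sin(kappa * x) for kappa, A, B in coeffs)

    return u


l, m = 1.0, 0.5
u = laplace_strip_solution(lambda x: math.sin(math.pi * x / l), lambda x: 0.0, l, m)
```

For these sample data the series reduces to its first term, which can be written as $\sin\frac{\pi x}{l} \, \sinh\frac{\pi (m - y)}{l} / \sinh\frac{\pi m}{l}$.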
It should be noted that such a separation of variables is by no
means possible for every partial differential equation.

2. We now turn back to equation (15). Suppose that instead of 
boundary conditions (46) we have conditions of another type. For 
instance, let us take the conditions 


du 


u [x=0 = 05 RS 


= 

l 

This results in a change of conditions (21) specifying the eigenfunc- 
tions. The new conditions will be of the form 


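$$X(0) = 0, \quad X'(l) = 0$$
Then instead of (23) we arrive at the relations
$$C_1 = 0, \quad C_2 k \cos kl = 0$$
which imply
$$kl = -\frac{\pi}{2} + n\pi \quad (n = 1, 2, 3, \ldots)$$
Therefore the eigenvalues and the eigenfunctions of this problem
are expressed by the formulas
$$\lambda_n = \frac{1}{l^2}\left(-\frac{\pi}{2} + n\pi\right)^2, \quad X_n(x) = \sin\left[\left(-\frac{\pi}{2} + n\pi\right)\frac{x}{l}\right]$$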


Hence, the spectrum of eigenvalues (together with the spectrum of 
frequencies) and the set of eigenfunctions have changed. The first 
four eigenfunctions are depicted in Fig. 355. It is easy to directly 
verify that the eigenfunctions are orthogonal to one another. It
can be proved that the eigenfunctions of the problems of this kind 
always form a complete system of functions. Therefore the solution 
of such problems can be completed by means of techniques similar 
to those given in Sec. 4. 
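The orthogonality mentioned above can also be checked numerically. The sketch below assumes the eigenfunctions $X_n(x) = \sin\left[\left(n - \frac{1}{2}\right)\frac{\pi x}{l}\right]$ corresponding to the conditions X (0) = 0, X' (l) = 0, and approximates the integral of a product of two eigenfunctions over the interval from 0 to l by the midpoint rule; the function names are illustrative.

```python
import math

def eigenfunction(n, l, x):
    """X_n(x) = sin((n - 1/2) * pi * x / l), n = 1, 2, ..."""
    return math.sin((n - 0.5) * math.pi * x / l)

def inner_product(m, n, l, steps=4000):
    """Midpoint-rule approximation of int_0^l X_m(x) X_n(x) dx."""
    h = l / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += eigenfunction(m, l, x) * eigenfunction(n, l, x)
    return total * h
```

For m ≠ n the result is zero up to rounding, and for m = n it equals l/2, which is the normalizing constant needed when expanding a given function into a series in these eigenfunctions.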

We often deal with more complicated equations when finding 
eigenfunctions of a problem. For instance, take the boundary con- 
ditions of the form 


$$u|_{x=0} = 0, \quad \left.\left(\frac{\partial u}{\partial x} + \alpha u\right)\right|_{x=l} = 0$$
[see formula (14)]. Then instead of (23) we obtain
$$C_1 = 0, \quad C_2 (k \cos kl + \alpha \sin kl) = 0$$
which implies $\tan kl = -\frac{k}{\alpha}$. Let us denote kl by μ. The equation
for determining μ is of the form $\tan \mu = -\frac{\mu}{\alpha l}$. A method of graphi-
cal solution of the equation is illustrated in Fig. 356. The graphs
clearly indicate that there is an infinite sequence of positive solu-
tions $\mu_1 < \mu_2 < \mu_3 < \ldots$ This implies
$$\lambda_n = \frac{\mu_n^2}{l^2}, \quad X_n(x) = \sin\frac{\mu_n x}{l} \quad (n = 1, 2, \ldots)$$
In this case the eigenfunctions also form a complete orthogonal 
system of functions. 
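Instead of the graphical method the roots can be found numerically. The sketch below assumes α > 0 and the equation tan μ = −μ/(αl). To avoid the discontinuities of the tangent it works with the equivalent equation αl sin μ + μ cos μ = 0, whose left-hand side changes sign on each interval ((n − 1/2)π, nπ), and applies bisection there; the names and the tolerance are illustrative choices.

```python
import math

def robin_roots(alpha, l, count, tol=1e-12):
    """First `count` positive roots of tan(mu) = -mu/(alpha*l), assuming alpha*l > 0."""
    c = alpha * l

    def h(mu):
        # Same zeros as tan(mu) + mu/c on the intervals used below, but continuous.
        return c * math.sin(mu) + mu * math.cos(mu)

    roots = []
    for n in range(1, count + 1):
        lo, hi = (n - 0.5) * math.pi, n * math.pi
        f_lo = h(lo)
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            f_mid = h(mid)
            if (f_mid > 0) == (f_lo > 0):
                lo, f_lo = mid, f_mid  # root lies in the right half
            else:
                hi = mid               # root lies in the left half
        roots.append(0.5 * (lo + hi))
    return roots
```

The eigenvalues then follow as the squares of the roots divided by l². As n increases the roots approach the left ends (n − 1/2)π of the intervals, which can be seen from the graphs as well.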

[Fig. 355] [Fig. 356]

3. Let us consider equation (2) describing the process of forced
oscillations of a string under the simplest boundary conditions (16)
and initial conditions (17). (By the way, the problem can be treated
similarly in the case of boundary conditions of other types.) Here
the first stage is to determine the set of the eigenfunctions of the cor-
responding homogeneous problem. This was performed in Sec. 4
and we can therefore apply the results obtained there. After that
we expand the functions u (x, t) and f (x, t) into series in the eigen-
functions for any fixed value of t and obtain
$$u(x, t) = \sum_{n=1}^{\infty} T_n(t) \sin\frac{n\pi x}{l} \tag{29}$$
and
$$f(x, t) = \sum_{n=1}^{\infty} H_n(t) \sin\frac{n\pi x}{l} \tag{30}$$

The coefficients $H_n(t)$ are immediately found as the Fourier coef-
ficients of the given function f (x, t):
$$H_n(t) = \frac{2}{l} \int_0^l f(x, t) \sin\frac{n\pi x}{l}\,dx$$
But the coefficients $T_n(t)$ are unknown here; they are the sought-
for quantities of our problem.
Substituting (29) and (30) into equation (2) we obtain
$$\sum_{n=1}^{\infty} T_n''(t) \sin\frac{n\pi x}{l} = a^2 \sum_{n=1}^{\infty} T_n(t) \left(\frac{n\pi}{l}\right)^2 \left(-\sin\frac{n\pi x}{l}\right) + \sum_{n=1}^{\infty} H_n(t) \sin\frac{n\pi x}{l}$$
Now, equating the coefficients in similar eigenfunctions we find
$$T_n'' + \left(\frac{n\pi a}{l}\right)^2 T_n = H_n(t) \quad (0 < t < \infty) \tag{31}$$


Besides, if we substitute t = 0 into (29) then, according to initial
conditions (17), its left-hand side must be equal to φ (x). Consequent-
ly, the values $T_n(0)$ of the functions $T_n(t)$ are equal to the coeffi-
cients of the expansion of the given function φ (x) into a series in
the eigenfunctions. This enables us to easily find these values:
$$T_n(0) = \frac{2}{l} \int_0^l \varphi(x) \sin\frac{n\pi x}{l}\,dx \tag{32}$$
Similarly, differentiating equality (29) with respect to t and sub-
stituting t = 0 after that, we obtain
$$T_n'(0) = \frac{2}{l} \int_0^l \psi(x) \sin\frac{n\pi x}{l}\,dx \tag{33}$$


Thus, to find $T_n(t)$ we must take initial conditions (32), (33) and
solve the ordinary non-homogeneous linear differential equation 
with constant coefficients (31). The solution of the equation can be 
easily found by means of the method of variation of arbitrary con- 
stants [see Sec. XV.15 and, in particular, the solution of equation 
(XV.87)]. Substituting the coefficients thus found into (29) we ob- 
tain the sought-for solution. 
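For an equation of the form T'' + ω²T = H (t) with ω = nπa/l, the method of variation of arbitrary constants leads to the well-known expression $T(t) = T(0)\cos\omega t + \frac{T'(0)}{\omega}\sin\omega t + \frac{1}{\omega}\int_0^t H(\tau)\sin\omega(t - \tau)\,d\tau$. The sketch below evaluates it with a midpoint quadrature for the integral; the function name and the number of quadrature steps are illustrative assumptions, not notation from the text.

```python
import math

def forced_mode(H, omega, T0, T1, t, steps=2000):
    """Solution of T'' + omega^2 T = H(t), T(0) = T0, T'(0) = T1, evaluated at time t."""
    h = t / steps
    integral = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        integral += H(tau) * math.sin(omega * (t - tau))
    integral *= h
    free = T0 * math.cos(omega * t) + T1 / omega * math.sin(omega * t)
    return free + integral / omega
```

With a constant right-hand side H = c and zero initial values the formula reduces to c (1 − cos ωt)/ω², which gives a convenient check of the computation.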

4. Let us now take a problem in which not only the equation but 
also the boundary conditions are non-homogeneous. For instance, 
let the problem be of the form 


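$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} + f(x, t) \quad (0 < x < l,\ 0 < t < \infty) \tag{34}$$
$$u|_{x=0} = \chi_1(t), \quad u|_{x=l} = \chi_2(t) \quad (0 < t < \infty) \tag{35}$$
$$u|_{t=0} = \varphi(x), \quad \left.\frac{\partial u}{\partial t}\right|_{t=0} = \psi(x) \quad (0 < x < l) \tag{36}$$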


This problem can be reduced to a problem of the type considered 
above. For this purpose we take an arbitrary function g (x, t) satis-
fying the boundary conditions, i.e. conditions (35). For example,
we can put $g(x, t) = \chi_1(t) + \frac{x}{l}[\chi_2(t) - \chi_1(t)]$ or choose g (x, t)
in any other way. Then we replace the sought-for function by means
of the formula
$$u(x, t) = g(x, t) + U(x, t) \tag{37}$$


where U (x, t) is the new unknown function. To derive the diffe-
rential equation and the corresponding subsidiary conditions for
U (x, t) we must substitute expression (37) into all the equalities
(34)-(36). This results in
$$\frac{\partial^2 U}{\partial t^2} = a^2 \frac{\partial^2 U}{\partial x^2} + \left[f(x, t) + a^2 \frac{\partial^2 g}{\partial x^2} - \frac{\partial^2 g}{\partial t^2}\right] = a^2 \frac{\partial^2 U}{\partial x^2} + F(x, t)$$
$$U|_{x=0} = \chi_1(t) - g|_{x=0} = 0, \quad U|_{x=l} = \chi_2(t) - g|_{x=l} = 0 \tag{38}$$
$$U|_{t=0} = [\varphi(x) - g|_{t=0}] = \Phi(x), \quad \left.\frac{\partial U}{\partial t}\right|_{t=0} = \left[\psi(x) - \left.\frac{\partial g}{\partial t}\right|_{t=0}\right] = \Psi(x)$$
where F, Φ and Ψ designate the expressions in the square brackets
(let the reader perform all the calculations). When deducing equa-
lities (38) we have taken into account that the function g (x, t)
satisfies boundary conditions (35). Thus, to find U (x, t) we must
now solve a problem which is completely analogous to the problem
solved above, the only difference between them lying in the notation
and in the particular form of the functions f, φ and ψ. After the
function U has been determined we obtain the solution of the ori- 
ginal problem by means of formula (37). 
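As an illustration, take the simplest choice of the auxiliary function, linear in x, and write the boundary values at x = 0 and x = l as $\chi_1(t)$ and $\chi_2(t)$ (our notation for the right-hand sides of (35)). Since the second derivative of such a function with respect to x vanishes, the new right-hand side and the new initial data can be written out explicitly. The formulas below are a sketch under these assumptions, obtained by direct substitution.

```latex
g(x,t) = \chi_1(t) + \frac{x}{l}\,\bigl[\chi_2(t) - \chi_1(t)\bigr],
\qquad \frac{\partial^2 g}{\partial x^2} = 0,
\\[4pt]
F(x,t) = f(x,t) - \chi_1''(t) - \frac{x}{l}\,\bigl[\chi_2''(t) - \chi_1''(t)\bigr],
\\[4pt]
\Phi(x) = \varphi(x) - \chi_1(0) - \frac{x}{l}\,\bigl[\chi_2(0) - \chi_1(0)\bigr],
\\[4pt]
\Psi(x) = \psi(x) - \chi_1'(0) - \frac{x}{l}\,\bigl[\chi_2'(0) - \chi_1'(0)\bigr].
```

In particular, if the boundary values do not depend on time, then F coincides with f and Ψ coincides with ψ, so only the initial displacement is changed.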
Gradient 366
Graph of a function 45, 54-56 
Graphical method of representing 
functions 44 
Greatest lower bound 170 
Greatest value of a function 131, 
168-170, 389, 390 
Green’s 
formula 639 
function 492ff, 539, 540, 622 
Guldin’s 
first theorem 484 
second theorem 602 


Half-life of a radioactive element 507 

Harmonic analysis 696 

Harmonic oscillations 72, 73, 269 

Heat equation 782 

Heaviside unit function (step func- 
tion) 494 
High-speed electronic computer 757,
762 


Hilbert space 707 
Hodograph 249 
Homogeneous function 300 
Hooke’s law 780 
Hyperbola 66, 99-102 
canonical equation of 100 
centre of 100 
conjugate axis of 100 
eccentricity of 103 
focuses of 100 
principal axes of 100 
transverse axis of 100 
vertices of 100 
Hyperbolic point of a surface 383 
Hyperboloid(s) 324-326 
of one sheet 324 
of revolution 324, 326 
of two sheets 325 
Hyperplane 243 
Hypocycloid 90
Hypothesis of plane sections 780 


Image (under a mapping) 340 
Imaginary 
axis 260 
part of a complex number 259 
unit 259 
Improper integral 454ff 
absolutely convergent 461, 617 
comparison test for convergence 
of 459ff 
conditionally convergent 462 
convergent 455, 615 
dependent on a parameter 476ff 
divergent 455, 615 
divergent to infinity 455 
multiple 615ff
dependent on a parameter 
617ff 


oscillating divergent 457 
properties of 458-464 
Improper rational fraction 277 
Increment of a variable (function) 60, 
139, 140, 294, 296 
Independent equations 301 
Indeterminate forms 113, 116, 127, 
129, 130, 132, 158-160 
Index of summation 118 
Infinite interval 30 
Infinitesimals 109-112 
comparison of 121-122 
equivalent 424
Initial conditions 499, 500, 524, 712,


Initial phase of a harmonic 72 
Initial value problem 500, 784
Input 
of a computer 763 
variable 758 
Instantaneous velocity 135, 250 
Instruction 763, 769ff 
Integrable combination 520, 527 
Integral(s) 
cosine 449 
curve 499, 523 
curve of a differential equation 
499 
definite 420 
dependent on a parameter 474-478 
continuity with respect to the 
parameter of 475
differentiation with respect 
to the parameter of 475, 496 
integration with respect to 
the parameter of 476 
double 587 
geometric meaning of 592 
exponential 449 
Fresnel’s 448 
improper 454, 615 
indefinite 394 
multiple 587 
applications of 589ff 
of higher order 622ff 
properties of 587ff 
of a periodic function 430, 434 
over a manifold 622ff 
over a plane figure 599ff 
over a rectangle 596ff 
proper 454 
sine 449 
Stieltjes 624 
sum 419, 586, 587, 597 
surface 602ff 
test for convergence of a series 
647 


triple 587 
volume 604ff 
with respect to a measure 620-622 
Integrand 394, 420 
Integrating 
factor 511 
functions by means of series 450, 
467, 468 
inequalities 433 
Integration 
elementary methods of 393-404 
element of 394, 420 
by change of variable (by substi- 
tution) 402, 428, 429 
by means of differentiation 547,
by parts 400, 428
by quadratures 503 
limits of 420 
mechanical 450 
numerical 450-454 
of irrational functions 407-412 
of rational functions 405-407 
of trigonometric functions 412-415 
Interpolation 45, 60, 182, 191, 196, 
287, 452 
Interval 30 
of constancy of a function 49 
of convergence of a power series


of integration 420 
of monotonicity of a function 
49, 165, 166 
Invariance of the form of the diffe- 
rential 152, 299 
Invariant 92 
Inverse interpolation 198 
Inversion of order of integration 598 
Isocline 502 
Isolated singular point of a plane 
curve 372 
Isomorphism 
of Euclidean spaces 247 
of linear spaces 243 


Jacobian 302 
Joint distribution 737ff 
Jump discontinuity 127, 289 


Lagrange’s 
differential equation 517 
form of the remainder of Taylor’s 
series 165 
interpolation formula 191, 192,


method of undetermined multi- 
pliers 386 
method of variation of arbitrary 
constants 506, 531, 535, 554, 
555 
theorem (on finite increments) 187 
Lamé’s coefficients 608, 611, 644 
Laplace’s equation 784, 794 
Law of large numbers 753 
Law of motion 87 
Law of refraction 173 
Least-square method 380, 381 
Least 
upper bound 170 
value of a function 131, 168-170, 
389, 390 
Lebesgue measure 620, 623, 642, 739 




Left-hand screw rule 227 
Legendre’s polynomials 689 
Leibniz 
formula (for differentiation of 
an integral with respect to the 
parameter)
rule (for differentiation of a pro- 
duct) 156 
test for convergence of a series 
650 


Lemniscate 94 


Level 
lines 285, 374 
surfaces 292, 368 
L’Hospital’s rule 158, 159 
Limit(s) 113-117 
inferior 115 
infinite 115 
left-hand 126 
of a ee expression 


of integration (upper, lower) 420 
point 114 
properties of 115-117
right-hand 126 
superior 115 
Linear 
algebra 238, 244
approximation 287 
combination of vectors 216, 241 
differential equations 505, 506, 
528-535, 541-549, 553ff 
extrapolation 61, 288 
function 53, 60 
interpolation 61, 182, 287, 452 
law of elasticity 492, 498 
operator 493, 528, 554 
space 237
Linearization 154, 183 
Linearly dependent (independent) func- 
tions 530 
Linearly dependent (independent) vec- 
tors 217, 241 
Line integral 
of the first type 479 
of the second type 482ff 
Localized (bound) vector 213 
Location (of memory of a computer)


Logarithmic 

scale 28 

spiral 85, 86 
Lower bound 30, 31 
Lyapunov stability 558ff 


Machine translation 769 
Maclaurin’s series 164 


Magnetic core, drum and tape 768, 
Mapping 282, 339ff
degenerate 364 
eigenvalue of 350 
eigenvector of 350 
identity 347 
into a space 340, 341 
inverse 344, 
isometric (orthogonal) 353 
linear 340, 344 
matrix of 342 
nonlinear 358-362 
one-to-one 359 
onto a space 341 
Marginal distribution 738 
Mass-scale phenomena 722 
Mathematical physics 780 
Mathematical statistics 756 
Matrix (matrices) 329 
addition of 334 
characteristic equation of 335 
column 330 
complex 333 
degenerate (singular) 334 
determinant of 331 
diagonal 330 
eigenvalue of 335 
eigenvector of 335 
form of a system of linear diffe- 
rential equations 557 
inverse 333 
multiplication of 332 
non-degenerate (non-singular) 334 
of a linear mapping 342
operations on 331-333 
order of 330 
orthogonal 352 
principal diagonal of 330 
rank of 337 
row 330 
skew-symmetric (antisymmetric)


square 330 
symmetric 334 
properties of 353ff 
transposed 330 
transposed conjugate 333 
unit 330 
zero 330 
Maximum 
of a function of one variable 167 
of a function of several variables


Mean 
density 435 
deviation 664 
square deviation 664 
value 
of a function 434, 589 
of a random variable 741 
Measure 586, 620ff, 642, 739 
theorem 434 
Memory (storage) of a computer 762 
Method(s) 
Adams 582, 583 
approximate, for solving differen- 
tial equations 562ff 
collocation 275, 280 
combined 183 
cut-and-try 181, 182 
decomposition (for integrals) 398 
direct (for finding an extremum) 


388 
Euler’s (broken line) 578, 579 
iterative 185-187 
for solving differential equa- 
tions 562 
for systems of linear equa- 
tions 208, 209 
for systems of non-linear 
equations 394 
Lobachevsky’s 274 
Milne’s 583, 584 
Newton’s 182 
for systems of equations
394 
numerical, for solving differen- 
tial equations 578-594 
of chords 182 
of elimination 208 
of least squares 380, 381, 575 
of moments 575 
of parallel sections 322 
of separation of variables 786ff 
of steepest descent 387 
of tangents (Newton’s) 182 
of undetermined coefficients 278ff,


of variation of arbitrary constants
(parameters) 506, 531, 535, 554, 555
Runge-Kutta 580-582 
Seidel’s 394
simplification 577 
small parameter (perturbation) 


189, 569ff 
Minimax 377
Minimum 


of a function of one argument 167 
Minor 

of a determinant 203 

of a matrix 337 
Mixed
partial derivatives 304 
problem 785 
(triple) scalar product 235 
Mobius strip 624 
Modulus 
of a vector 242 
of elasticity (Young’s modulus) 
538, 780 
Moment(s) 
of a random variable 746 
of a vector about a point 233 
of inertia 538, 596 
static 590, 596 
Monotonicity, intervals of 49, 165, 


166 

Multiple root of an equation 272 
Multiple-valued function 49 
Multiplication 

of approximate numbers 36-39 

of a vector by a scalar 215, 216 

of complex numbers 261 

of operators 550 

rule of probability theory 728 
Multiplier(s) 

Lagrange’s 386 

(unit of a computer) 759 
Multiply connected domain 487 


Natural (fundamental) frequency 533 

Natural (Napierian) logarithms 68 

n-dimensional manifold (space) 310 

Necessary condition for convergence 
of a series 120 

Necessary conditions for an extremum


Negative of a vector 245, 238 
Neighbourhood 34 
Neumann’s problem 786 
Newton-Leibniz theorem 427 
Newton’s 

interpolation formula 196-198,


law of gravitation 643, 619 
method (of tangents) 182 
Nodal point 90, 372, 545
Node of a standing wave 787 
Nomography 285, 286 
Non-orientable surface 624 
Non-perturbed solution of a differen- 
tial equation 573 
Non-restricting (one-sided) constraints


Non-trivial solution of a homogeneous 
system 335 
Norm of a vector 244 
Normal acceleration 252
Normal form of a system of differen- 
tial equations 522, 524 
Normal (Gaussian) law 736, 739, 750 
Normal plane (to a curve) 254 
Normal section of a surface 384 
principal 382 
Normal to a curve 137 
Normalization 244, 689, 754 
factor 739 
Number 
complex 259 
conjugate 263 
e 68, 124, 164, 509 
imaginary 260 
pure imaginary 260 
real 260 
scale 27 
system 
binary 764 
binary-decimal 765 
octal 765 
vector 330 
Numerical 
characteristics of random vari- 
ables 741-748 
integration 450-454, 649 
solution 
of ome equations 273- 


of differential equations 578- 
594 


One-parameter family of curves 372
Open interval 30 
Operation(s) 
commuting 347 
non-commuting 347 
Operator(s) 342, 493 
commuting 550 
difference 550 
differential 528, 549, 550 
equation 552 
linear 493, 528, 554 
method of solving differential 
equations 552, 55: 
non-commuting 550 
non-linear 554 
of differentiation 473 
of multiplication by a number 
(by a given function) 550 
power series of 554 
powers of 554 
shift 550 
unit 550 
zero 550 
Optical properties of conic sections 147,
148 


Order of an algebraic curve 90 
Orders of smallness 124 
Orientable 
manifold 623 
surface 624 
Orientation of a surface 227 
Orthogonality 
of functions 686, 687, 703 
of vectors 224, 246 
with weight function 708 
Orthogonalization 247, 689 
Orthogonal polynomials 688 
Orthogonal vectors 224 
Oscillating divergent series 652 
Oscillations 
damped 66, 109, 114, 544 
forced 532, 547, 548, 783 
free 497, 544, 548, 783 
harmonic 72, 73, 269 
undamped 114 
Osculating circle 254 
Osculating plane 252 
Ostrogradsky’s theorem 630, 782 
Overflow 771, 789 
Overtone 697 


Parabola 62 

axis of 62 

cubic 64 

quadratic 62 

safety 373 

semicubical 65 

vertex of 62 
Parabolic point of a surface 383 
Paraboloid 

elliptic 326 

hyperbolic 327

of revolution 316, 326 
Parallelogram law for addition of 

vectors 213 

Parameter 27 
Parametric representation 

of a curve 87 

of a function 87, 318 

of a surface 317 
Parseval relation (theorem) 

for Fourier series 704 

for Fourier transform 719 
Partial 

derivative 294 

of a composite function 298 
difference 303 
quotient 304 
differential 294, 303 
differential equation 498 
increment 294, 296 
rational fraction 278, 280 
Period of a function 50, 54 
Perturbed solution 189 
Phase 
of a harmonic 72 
space 525 
trajectory 525 
Planar point of a surface 384 
Planimeter 450 
Point of discontinuity of a function 
49, 126, 127 
Point of inflection 64, 173 
Poisson 
equation 784 
law 735 
Polar 
angle 81 
coordinates 84 
radius 84 
Polynomial 52, 271ff 
Potential of a force 456 
Practically impossible event 726, 
734 
Preimage (original, or inverse image) 
340 


Primitive period 54 
Principal 
normal 252 
value of a divergent integral 473 
value of an inverse trigonometric 
function 74 
Probability 723 
a posteriori 730 
a priori 730 
conditional 727 
distribution 733 
conditional 746 
integral 737 
properties of 725-727
Problem of two bodies 104 
Program (for a computer) 44, 767ff 
Programming 764ff, 774ff 
Projection of a vector 219 
Proper integral 454 
Proper rational fraction 277 
Properly divergent series 645 
Properties 
of continuous functions 129-134 
of definite integral 426-431 
of derivatives 139-142 
of indefinite integral 397-399 
of infinitesimals 111, 112, 122 
of limits 115-117
Pseudoscalar 234 
Pseudovector 234 


Punch 
card 762 
tape 767 
Pythagoras’ theorem 79, 224, 707 


Quantity 25 
constant 26 
dimension of 25 
dimensionless 25 
variable 26

Quadrants 79 

Quadratic form 355 
matrix of 356 
negative definite 379 
positive definite 379 
reduction to a diagonal form 

of 357 
Quadratic function 53, 62 


Radioactive decay 507 
Radius of convergence of a power 
series 667, 676 
Radius-vector 222 
Random 
event(s) 724 

certain (sure) 725 
impossible 725 
independent 728 
mutually exclusive 727 
opposite (contrary) 726 
practically impossible 726,


variable 732 
continuous 732 
discrete 732 
multidimensional 738 
normalized 754 
uniformly distributed 736 
vector 739 
Range 
of a function 48 
of a variable 30 
Rationalization of integrals 407 
Real part of a complex number 259 
Real time simulation 764 
Reducing the order of a differential 
equation 519ff 
Reduction of a higher-order differen- 
tial equation to a system of first- 
order equations 521, 522 
Regression 
curve 747 
function 747 
linear 748, 756 
Relative frequency 722

Remainder 120, 162, 165 

Resolution of a vector along given 
vectors (axes) 217, 218 

Resonance 

Riccati’s equation 506, 563 

Riemann zeta function 656 

Right-hand screw rule 227 


Root
multiple 274 
of a function 272 
of a polynomial 271 
simple 272 
Rotation (curl) of a vector field 636, 


642 
Roulette 89, 258 
Rounding 34 
Routine (program) 767
Rule of a reserve decimal digit 35


Saddle point 515 
Sampling 721, 
with replacement 729 
without replacement 729 
Scalar 242 
product of functions 706 
product of vectors 220, 221, 244
expression in Cartesian coor- 
dinates of 224 
properties of 221, 222 
square of a vector 222 


Scale
factor 62 
logarithmic 28 
non-uniform 28 
uniform 27 
Screw line (circular helix) 249 
Sequence 48
Serial storage access 769 
Series
alternating 650 
application to solving differen- 
tial equations of 564-572 
convergent 118 
absolutely 650, 658, 660 
conditionally 650 
divergent 119, 645, 652 
functional 664 ff 
uniform convergence of 663 
general term of 118 
harmonic 649 
in orthogonal functions 689ff 
multiple 659ff 
numerical 147 
operations on 652-654 
oscillating divergent 652 
partial sum of 118, 645 

positive 645 

power 666-677 

properly divergent 645 
rearranging the terms of 653, 


remainder of 120 
sum of 118 
Sign of double substitution 425 
Sign of identity 50 
Similarity coefficient 86 
Similarity transformation 86 
Simplifying general equation of an 
algebraic surface of the second 
order 327, 328 
Simply connected domain 487 
Simpson’s formula 452, 453, 649 
Sine integral 449 
Single-valued function 48 
Singular 
curve of a differential equation 
5 


2 
integral curve 514 
lines on a surface 384 


point
of a curve 372 
of a differential equation 
512, 524, 566 
of a surface 371, 384 
solution of a differential equation 
Sink 629 
Sinusoid 72 
Slide rule 28, 29 
Sliding vector 213 
Slope of a straight line 60 
Source 629 
Space 
Euclidean 244 
Hilbert 707 
linear 237 
n-dimensional Cartesian 239 
of events 310 
topological 310 
Specific heat 782 
Spectral density 715 
Spectrum 
continuous 713 
discrete 713 
of a problem 538 
Spheroid 323 
Spinode 65 
Standard deviation 745 
Standard methods of integration 404-
Standing wave 787 
State space 525 
Stationary (critical) point 376
Stationary (critical) value 166, 173 
Step of a table 43 
Stieltjes integral 624 
Stiffness factor 498
Stokes’ theorem 640 
Storage system, parallel 769 
Subroutine (subprogram) 768 
Subsidiary conditions (for a condi- 
tional extremum) 38: 
Subspace (submanifold) 239, 342 
Substitution 
hyperbolic 409 
trigonometric 408 
Subtraction 
of approximate numbers 34, 36 
of vectors 215 
Sufficient conditions for an extremum 
167, 168, 377-379 
Summation sign 148 
Sum of vectors 213, 244 
Superposition principle 493, 531, 554 


Symmetric form of a system of diffe- 
rential equations 523 
System
of first-order approximation 560 
of first-order differential equations 
522-526 
of linear algebraic equations 
206-214 
homogeneous 214 
of linear differential equations 
553ff 


Tabular integrals 396 

Tabular method of representing func- 
tions 43 

Tangent curve 73 

Tangent plane to a surface 368, 369 

Tangential acceleration 252 


Taylor’s 
formula 161, 374, 375 
series 163 
Term-by-term 
differentiation of a series 666, 668 


integration of a series 665, 668 
Test for distinguishing multiple roots 


27 
Theory of probability 724 
Torricelli’s law 438 
Total 
derivative 368 
differential 296, 297, 305 
increment 296 
Trajectory of motion 87 
Transcendental curves 90 
Transcendental surfaces 314 
Transfer instruction 771, 775 
Transformation of a matrix 
of a mapping 347ff 
of a quadratic form 356 
Transient process 128, 559 
Translation of a vector 213 
Transposing a determinant 203 
Trapezoid rule 454 
Trial 724 
Triangle inequality 245 
Trigger 768 
Trigonometric form of a complex 
number 260 
Trigonometric series 686-690 
Triple 
scalar product of vectors 235 
vector product of vectors 236 
Trivial solution of a homogeneous 
system 335 
Two-state element 768 
Two-way series 702 


Umbilical point of a surface 382 

Uncertainty principle 718 

Unconditional transfer of control 775 

Undetermined coefficients, method of 
278ff, 545, 565, 566 

Unfavourable outcome (case) 724 

Unilateral (one-sided or non-restrict- 
ing) constraints 388 

Unit vector 224 

Unperturbed solution 189 

Upper bound 30, 34 


Variable 26, 248 
bounded 34 
bounded above 34 
bounded below 34 
continuous 29 
dependent 39 
discrete 29 
independent 39 
infinitely large 142 
infinitesimal 109 
monotonic 30 
oscillating 1414 
point 28 
random 732 
range of 30 
vector 248 
Variance (dispersion) 744
Variation 572 
Variational equation 573 
Vector 212, 218 
eek eee of 247 
coordinates of 249 
field 367, 626ff 
function of a scalar argument 248 


modulus of 242 
moment of 233 
negative of 245 
of an area 228 
origin of 212 
product of vectors 228 
expression in Cartesian coor- 
dinates of 232 
properties of 230, 234 
projection of 219, 220 
solution of a system of differen- 
tial equations 557 
terminus of 212 
Vectorial angle 84 
Vector parametric equation of a curve


Velocity 
average 134 
convective 368 
instantaneous 135 
Vertex of a curve 254, 257 
Volume of a solid 445ff 


Wave equation 784 

Wave length 73 

Wave number 73 

Weierstrass’ test for uniform conver- 
gence of a functional series 663 


Young’s modulus 538, 780 


Zero(s) 
line of 290 
of a function 133, 272 
of a polynomial 271ff 
Zero-dimensional space 241 
Zero matrix 330 
Zero vector 215 
Zeta function 650 
Constants
e 68; Bn 677; Bn 678; (x) 156, 165; C 650. 


Functions (general notation) 
f(x) 40; y |x-a 40; f(x, y) 44; z |= 41; 
y= 
f (@ (a)) 42; y (x) 42. 


Functions (special notation) 
ln x 69; exp x 70; sinh x 70; cosh x 70; tanh x 70; sinh z, cosh z,
tanh z 271; Ln z 267; Erf x 448; C (x), S (x), Ei (x), Si (x), Ci (x) 449;
Γ (p) 469; B (p, q) 471; δ (x) 489; e (x) 491; G (x, ξ) 492; J_p (x) 568;
Y_p (x), N_p (x) 569; ζ (p) 655; P_n (x) 688, 689; T_n (x) 709; Φ (t) 137,
753, 754.
Theory of Limits 

<€, > 74, 121; 00 112; limz+a113; lim, lim 115; a~ P4121; o, 


O 121, 125; min, max 134; sup, inf 170; f (x) ~a)+2+ ... 686. 
Finite Differences 
A, Ax 60, 125; A, 192; AR 193, 194; Axu 294. 
Differential Calculus 
y’ 136; dy 149, 296; cI 150; z 152; y", y™® 155, 156; 2; dy, dy, 
dW. 156, 157; u% 294; du, 2E 295; 2l o) tu 
Vectors


a, AB, a| 212; 0 215; proj; a 249; a-b, (a, b) 220, 244; r, e 221; 
222; S 228; a Xb, [a, b] 230; En 239; Zn 245. 


Complex Numbers 
Re, Im 259; |z|, argz, Argz 260; 2*, z 263. 


Matrices 
ab ab : 
an 200; (? 2) , (dij)mn, A 329; A* 331, 333; 
diag (a, b, c), I 330; det A 334; rank Bote 


Field Theory 
9 365: gradu, grad;w366; div A630; rot A, rotn A636. 


Integral Calculus 


: : 
fro dx 394; j f(z)dx420; F(z) le 425, 426; f 434, 589; 


a 


§ 484, 629; f waa 587; f f 599. 
: (2) (Q) 
Probability Theory 
P {4}723; P{A| B}727; f 733; E M{&}, 
ME 741; DE, D{Ẹ) 744; rg, q 748. 
Some Other Symbols 
EMI 25; = 33, 37; =50; 117,118; € 237; 
E 237; (R) 237; Ly 707. 




Printed in the Union of Soviet Socialist Republics 


The author, Prof. Anatoly MYŠKIS, D.Sc., is well known not only for
his original research, but also for his equally original approach to
the teaching of higher mathematics. He is one of the founders of the
theory of differential equations with retarded argument.

His publications include LINEAR DIFFERENTIAL EQUATIONS WITH
RETARDED ARGUMENT, ELEMENTS OF APPLIED MATHEMATICS (co-author)
and SPECIAL COURSES IN MATHEMATICS FOR TECHNICAL COLLEGES.



Mir Publishers 
Moscow 




